James Farrugia's Blog: October 2015

16 October, 2015

Java Programming Tutorial - Unit 2 - Methods and Variables

Let's start from where Unit 1 left off. During the first steps, we instructed our computer to write "Hello World!" to the display. Through that unit, we went through quite some material, despite it being so simple. In this unit, we'll cover more practical points rather than theoretical ones, so get ready to write some more code this time!

Variables

So, what is a variable? As the name clearly indicates, it is something that varies. A variable is just a label which you can use to store something. Let's say we want to store our user's name, we create a new label named username and assign the user's input to this label. A user types in their name and we instruct the computer to store the input somewhere which can be addressed using the word "username".

To better understand and appreciate how useful that little label is, try imagining having hundreds of such variables all without a human-friendly name. You do not need to go far; older languages had no such concept and used memory addresses exclusively.

With this knowledge, you can now think of your computer's memory as being a large room full of P.O. Boxes. Each P.O. Box may be referred to by its number. In modern languages you can give each P.O. Box its own unique name too, so it's easier for you to know what you're working with.

The next bit is theoretical, however it's good to know about variable terminology.
Java is known as being strongly typed. This strongly named description simply means that each variable can have one type, and one type only. If we declare our username variable as being of type text, it can only contain text. If we had another one for storing a number, it could only store a number. It's a restriction, but it's convenient. There are languages whose variables can change type during runtime; these are usually called dynamically typed (and are often loosely described as weakly typed). That's convenient too, but it's easier to shoot yourself in the foot if you're new to it.
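To make the one-type-per-variable rule concrete, here is a small sketch. The class name TypingDemo and the variable names are only illustrative assumptions; the commented-out lines are the ones the compiler would reject.

```java
public class TypingDemo {
    public static void main(String[] args) {
        // A variable declared as String can only ever hold text.
        String username = "James";
        // A variable declared as int can only ever hold a whole number.
        int age = 30;

        // Assigning a new value of the same type is fine.
        username = "Maria";
        age = age + 1;

        // These lines would not compile, because the types do not match:
        // username = 42;
        // age = "thirty";

        System.out.println(username + " is " + age);
    }
}
```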

Let's make use of a variable in a more practical example. In this task, we want our text to be defined as a variable, rather than passing a direct value to System.out.println.
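A minimal sketch of what this could look like, assuming the class is still called HelloWorld as in Unit 1; the variable name message is only an illustration:

```java
public class HelloWorld {
    public static void main(String[] args) {
        // Store the text in a variable instead of passing it directly.
        String message = "Hello World!";
        System.out.println(message);
    }
}
```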

As you can see, the change is minor. In the new line, the only thing which may be new is the String label. A String is simply a series of characters; it is a type of variable which you'll find in the vast majority of programming languages.

Variable types

Now that you have declared your first variable, and hopefully got to understand the relation between the type of the variable and the content it stores, it is safe to introduce the list of primitive types in Java. As you now know, Java is object oriented and everything is defined as a class. This implies that every instance in our program is an object. However, this is not entirely true, since objects need to be made up of something. If we keep going deeper into what constitutes an object, we find that there are only 8 primitive types. Each primitive type is made up of some number of bits. These are as follows; afterwards we'll go through them:

boolean - no specific number of bits, but practically 1
byte - 8 bits
char - 16 bits
short - 16 bits
int - 32 bits
long - 64 bits
float - 32 bits
double - 64 bits

As you can see (assuming you're familiar with bits, the basic units of information), all types are practically increasing sizes of numbers. No letters, no images, nothing but numbers. Later I'll explain how everything can be made from these primitives, but first let's see how we can organise them into roughly three categories.

First we have the boolean type. This can have just two values, conceptually 1 or 0. In Java we write these as true or false, and the type is mostly used for setting states and flags.

Next come the whole numbers (integers). All primitives from the byte to the long fall under this category. Values stored by these types cannot have any digits after the decimal point. One thing to note about the char type is that it does store a numeric value, however it is treated as a character. Note also that it is a 16-bit Unicode (UTF-16) value.

Finally we have the real numbers; the float and double. Double, as the name implies, is just double the size of a float. It is usually much more practical to work with a double unless you're working on a high performance system where memory is precious (not all systems have gigabytes of memory to waste).

Primitive types can be easily declared or have a value assigned to them. If you want an integer with a value of 10, simply enter:

int myNumber = 10;
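The other primitives are declared in the same way. The following is a small sketch with one variable per type; the class name PrimitivesDemo and all variable names other than myNumber are assumptions made for illustration.

```java
public class PrimitivesDemo {
    public static void main(String[] args) {
        boolean isReady = true;        // true or false
        byte smallNumber = 100;        // 8 bits: -128 to 127
        char letter = 'J';             // 16 bits: a single character
        short mediumNumber = 30000;    // 16 bits: -32768 to 32767
        int myNumber = 10;             // 32 bits
        long bigNumber = 10000000000L; // 64 bits; note the L suffix
        float price = 9.99f;           // 32 bits; note the f suffix
        double pi = 3.14159;           // 64 bits

        // A char is stored as a number, so it can be assigned to an int.
        int letterCode = letter;
        System.out.println(letter + " is stored as " + letterCode);
    }
}
```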

Composites

Now that you have the most granular types, it is possible to mix and match to create more complex types. A composite is basically another name for an object. The String type, for example, is a composite. In order to explain this composite, we need to introduce another programming term: arrays. An array is just a contiguous series of memory cells, each containing a value of the same type. Java has native support for arrays and supports creating new arrays during runtime (older languages did not support this directly). The next snippet shows how we can use an array of characters to emulate a String, albeit in a less practical way.
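A minimal sketch of the idea, assuming a class named CharArrayDemo and a five-letter word chosen purely for illustration:

```java
public class CharArrayDemo {
    public static void main(String[] args) {
        // An array of five characters, created at runtime.
        char[] letters = new char[5];
        letters[0] = 'H';
        letters[1] = 'e';
        letters[2] = 'l';
        letters[3] = 'l';
        letters[4] = 'o';

        // Printing a char array writes its characters one after another.
        System.out.println(letters);

        // A real String can be built from the same characters.
        String word = new String(letters);
        System.out.println(word.length());
    }
}
```

Filling the array cell by cell is what makes this less practical than simply writing "Hello".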

Unlike primitives, composites (or, to use the proper name, objects) are created using the new keyword. The declaration also follows this convention:
Type myTypeVariable = new Type();

As you can see, there is the type, the name, followed by the assignment to a new instance of the class (or type). Note, though, that String is a special case in Java and can be assigned a value directly, like a primitive. This is an exception and does not apply to classes you write yourself.

Probably the String is not enough, so let's go through some more examples. Let's say we want to show a picture. What constitutes a picture? Pixels, the number of pixels in width, and in height. Width and height are just numbers. The pixels are an array of the Pixel object (so we also have nested composites). And after that, what is in each pixel? Three values for the primary colours Red, Green and Blue; again, three numbers.

Let's define our own Picture type. First, we need a Pixel. Then we'll create a Picture and we'll find its area. Using this area, we'll set the value of the pixels in our Picture, since initially this is null.

Now we'll create the program "body". The main class this time will create the image, the pixel array, and print out the area. Note how we concatenated the text and a variable using the '+' symbol. I'll explain the operators later on in this unit.
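The Pixel and Picture types and the main class described above could be sketched as follows. The field names, the class name HelloWorldPicture and the 4 by 3 size are assumptions made for illustration.

```java
// A pixel is just three numbers: red, green and blue.
class Pixel {
    int red;
    int green;
    int blue;
}

// A picture is a width, a height and an array of pixels.
class Picture {
    int width;
    int height;
    Pixel[] pixels; // null until we assign an array to it
}

public class HelloWorldPicture {
    public static void main(String[] args) {
        Picture picture = new Picture();
        picture.width = 4;
        picture.height = 3;

        // The area tells us how many pixels the picture needs.
        int area = picture.width * picture.height;
        picture.pixels = new Pixel[area];

        // Text and a variable are joined with the '+' symbol.
        System.out.println("The area of the picture is " + area);
    }
}
```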

So you see, pretty much anything can be reduced to a number.

Operators

As I mentioned earlier, I'll give an introduction to operators. These are not so complex so there is not much else to learn about them.

Operators are the symbols used in code, such as the '+', '-', etc. The plus is used both for adding numbers and for concatenating text. For example, let's say we have variables a and b. a + b could mean the following:
If a and b are both numeric primitives, the result is their sum, which is also a primitive. If either a or b is a String, the other one is converted to its String representation and the result is the two joined together as a String.
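A small sketch of how '+' behaves with numbers and with text (PlusDemo and the variable names are illustrative). Note that the expression is evaluated from left to right, so brackets change the result once a String is involved.

```java
public class PlusDemo {
    public static void main(String[] args) {
        int a = 2;
        int b = 3;
        String text = "Total: ";

        System.out.println(a + b);          // 5
        System.out.println(text + a);       // Total: 2
        System.out.println(text + a + b);   // Total: 23
        System.out.println(text + (a + b)); // Total: 5
    }
}
```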

Other operators are reserved for primitives:

The minus '-' is used to subtract;
The star '*' is used to multiply;
The slash '/' is used to divide. The fractional part is dropped (not rounded) if the operands are not of type float or double;
The percent '%' is used for obtaining the remainder (modulo);
The hat '^' is used for bitwise XOR;
The pipe '|' is used for bitwise OR;
The ampersand '&' is used for bitwise AND;
The exclamation mark '!' is used for logical NOT on boolean values;
The greater than '>' and less than '<', for...well greater or less than;
The double less than and greater than ('<<' and '>>') for bit shifting.

You shall not be using many of these in the early days. However you should be familiar with the computing terms used here (such as shifting and bitwise operations).
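To see the operators from the list in action, here is a short sketch using the numbers 7 and 2 (the class name OperatorsDemo and the chosen values are illustrative). The binary forms in the comments show what the bitwise operators do.

```java
public class OperatorsDemo {
    public static void main(String[] args) {
        int a = 7;
        int b = 2;

        System.out.println(a - b);    // 5
        System.out.println(a * b);    // 14
        System.out.println(a / b);    // 3 (the fraction is dropped)
        System.out.println(7.0 / 2);  // 3.5
        System.out.println(a % b);    // 1
        System.out.println(a ^ b);    // 5 (0111 XOR 0010 = 0101)
        System.out.println(a | b);    // 7 (0111 OR 0010 = 0111)
        System.out.println(a & b);    // 2 (0111 AND 0010 = 0010)
        System.out.println(!(a > b)); // false
        System.out.println(a < b);    // false
        System.out.println(a << 1);   // 14 (bits moved one place left)
        System.out.println(a >> 1);   // 3 (bits moved one place right)
    }
}
```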

Methods

We mentioned something about methods during the first unit, mostly trying to relate them to the methods in your recipe books. This time, we shall add more methods to our little picture program. At first it might seem like overkill to have too many methods for a simple task, but as your project grows you'll come to appreciate shorter and more frequent methods.

So, the first task - adding new methods. But why, what are they going to do? Imagine we want our program to accept a user input. For this task, we'll use methods that were written by others - we'll be calling those methods. Afterwards we'll break up our program into smaller methods so that later on we can follow better programming practice. In this case we'll write our own methods too.

The program

Our next task will be to add on to the Hello World Picture program. This time the area will be calculated by the picture, rather than us having to calculate it in our main program. We'll also let the user specify the width and height of the picture. This user input will need some processing, as we shall see next.

First, we'll extend the Picture class so that it can support its own methods. We shall call this PictureExtended to avoid confusion for now.

Next we'll upgrade the main program. As you can see, it has many more methods and the functionality is more granular. If we had two pictures for example, we could still call the same createPicture, thus avoiding duplicate code.
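The extended class and the upgraded main program could be sketched roughly as follows. Apart from PictureExtended, getArea and createPicture, every name here (readNumber, printArea, createPixels, HelloWorldPictureExtended) is an assumption, and the sketch does not handle input that is not a whole number.

```java
import java.util.Scanner;

class Pixel {
    int red;
    int green;
    int blue;
}

class PictureExtended {
    int width;
    int height;
    Pixel[] pixels;

    // The picture now works out its own area.
    int getArea() {
        return width * height;
    }

    // Creates the pixel array once the size is known.
    void createPixels() {
        pixels = new Pixel[getArea()];
    }
}

public class HelloWorldPictureExtended {

    // Shows a prompt, reads one line and converts the typed text to an int.
    static int readNumber(Scanner scanner, String prompt) {
        System.out.println(prompt);
        return Integer.parseInt(scanner.nextLine().trim());
    }

    // Builds a picture of the given size; reusable for any number of pictures.
    static PictureExtended createPicture(int width, int height) {
        PictureExtended picture = new PictureExtended();
        picture.width = width;
        picture.height = height;
        picture.createPixels();
        return picture;
    }

    static void printArea(PictureExtended picture) {
        System.out.println("The area of the picture is " + picture.getArea());
    }

    public static void main(String[] args) {
        Scanner scanner = new Scanner(System.in);
        int width = readNumber(scanner, "Enter the width:");
        int height = readNumber(scanner, "Enter the height:");
        printArea(createPicture(width, height));
    }
}
```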

Static vs not static

Note how we put static in front of methods in the main class, while we did not put any in the PictureExtended. Now that we do have some methods and classes, it is safe to explain it.

Static methods are those methods that can be called without having an instance of the enclosing class. The same applies to static variables: for example, we never create a new System, yet we can use the System.out variable (which is of type PrintStream) because out is declared static. However, we cannot call getArea() on PictureExtended by itself. We must create a new PictureExtended and place it in a variable. We are then able to call the method through that variable.

This is basically the difference between static and non-static: if a method is static, it can be called without an instance of the enclosing class. However, it cannot directly access the non-static members of the class. Let's say we made getArea() static; in that case, it could no longer access the width and height values of a PictureExtended instance.
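The difference can be sketched with a cut-down stand-in for the picture class. Rectangle, areaOf and StaticDemo are hypothetical names used only for this illustration.

```java
class Rectangle {
    int width;
    int height;

    // Non-static: needs an instance, and can read that instance's fields.
    int getArea() {
        return width * height;
    }

    // Static: called on the class itself, so it has no width or height
    // of its own and must be given the values as parameters.
    static int areaOf(int width, int height) {
        return width * height;
    }
}

public class StaticDemo {
    public static void main(String[] args) {
        Rectangle rectangle = new Rectangle();
        rectangle.width = 4;
        rectangle.height = 3;

        System.out.println(rectangle.getArea());    // 12
        System.out.println(Rectangle.areaOf(5, 2)); // 10
    }
}
```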

Accepting user input

We are able to accept user input via the Scanner class. Again, this is just like System, a class already provided with Java (although we had to create a new Scanner, unlike System). Note how we passed System.in to it, telling it that we expect to receive input from the standard system input; the keyboard.

The difference from System is that Scanner resides in what is known as a package which is different from ours. We will go through packages in a later Unit, however note how we needed to import the class. The import statement has to be at the very top, outside the class declaration.
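As a stripped-down sketch of just the import and the Scanner, the following reads one line and greets the user. ScannerDemo and greet are hypothetical names; Scanner, nextLine and System.in come with Java.

```java
import java.util.Scanner; // imports go before the class declaration

public class ScannerDemo {

    // Reads one line from the given scanner and builds a greeting from it.
    static String greet(Scanner scanner) {
        String username = scanner.nextLine();
        return "Hello " + username + "!";
    }

    public static void main(String[] args) {
        // System.in is the standard input, normally the keyboard.
        Scanner scanner = new Scanner(System.in);
        System.out.println("What is your name?");
        System.out.println(greet(scanner));
    }
}
```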

Conclusion

This was quite a long unit and covers quite a lot. We created new classes, instances of these classes, or objects, static methods and imported some other ones too. In the next units we shall go over further interesting bits of programming in Java, such as loops, cases and conditionals.

Other tutorials (which are just as good or better) may hold off explaining the details of classes and objects initially. I believe that this might send off the wrong message about Java. It is understandable that it is initially complicated, however it will embed the idea that in Java one should follow an object oriented methodology, otherwise the code will not be up to standard. Not that it is incorrect, but as projects grow, not following conventions will make Java very frustrating.

So, as a precaution, I'm giving out fairly detailed descriptions of why classes and objects matter before going further into the traditional loops and conditionals. Hopefully the descriptions coupled with the actual code will make it more natural.

Thank you!

10 October, 2015

Java Programming Tutorial - Unit 1 - Hello World!

You've probably already seen this little "hello, world" thing somewhere on the Internet. It is the most popular phrase to write to the display when learning a new programming language and has been around since the 70's.

Before we start heading into the development part of this unit, we shall install what is known as the Java Development Kit (JDK). The JDK is an excellent set of tools that includes compilers, runtimes, libraries - a lot of tools and buzzwords. It's enough to know that it is necessary if you plan on programming in Java.

Installing the JDK

As explained in Unit 0, the JDK is compiled for every platform and architecture. As a result, you'll have to select the JDK for your system. I would assume you are running Windows, but I'll also consider Linux. If this is already too much, just follow the steps which I label as for Windows (although I'd suggest you read up a bit on Operating Systems and general computing before proceeding).

Pre-flight checks

Before you go through a 100MB+ download, make sure you do not have a JDK installation already. To check if you do, follow these steps:

On Windows (or if you're unsure) press the Windows key (the flag icon on your keyboard) and R simultaneously. The run dialog will open up; type cmd in it, which will open the console. On Linux or UNIX systems, open the terminal as specified in your distribution (I expect you know this by now).

In your console (whether it is Windows or Linux), enter javac -version. This is the command for the Java Compiler, so no, there are no typos there. If you get some meaningful output (i.e. a version number such as javac 1.8.0_4), then you can skip this installation part. If you get an error along the lines of "not found", then you'll need to install the JDK.

Downloading the JDK


The JDK setup is trivial; download it, and install it. That's all there is to it. So first of all, head to Oracle's website and select your version. If you're totally unsure, just select Windows x86. In case you're simply not sure if your system is 32 or 64 bit, do the following (if your system was made less than 4 years ago it's probably 64 bit):

Windows

Windows architecture

Right-click on "Computer"
Select "Properties"
Under "System", the "System type" tells you whether it is 64 or 32 bit (refer to image).

Linux/UNIX

Open terminal;
Type uname -i
If it is i386, i586 or i686, then it's 32 bit. If it's 64 bit you'll get x86_64.

A note on Ubuntu and its derivatives

If you're working on an Ubuntu system (or Mint, elementary or other derivatives), an excellent and very short guide can be found on webupd8. I suggest following that guide for JDK on Ubuntu.

GUI Installation

Once downloaded, it is only a matter of running it and clicking next, however this setup is unfortunately bundled with unwanted software too. So before hitting next, make sure you uncheck any field that tells you to install toolbars or whatever. These are absolutely unnecessary and are included only for marketing.

Installed

Now that the JDK is set up we're ready to do some work. Despite the elaborate system, the development kit is very simple to install. From the installation comment, "The Java Standard Edition Development Kit includes both the runtime environment (Java Virtual Machine, Java platform classes and supporting files) and development tools (compilers, debuggers, tool libraries and other tools)" so as you see, it's a great platform to work with.

Make sure everything is fine by again running javac -version. This should print out the version number of the Java compiler.

Hello, world!

Everything is now in place and all that remains is your first class! First what? Java is an object-oriented programming language (with brand new functional features since Java 8). These are a lot of buzzwords for now, so I'll keep it simple and then elaborate on these as we go along in the series.

For now suffice it to say that everything you do in Java is contained in what is known as a class. If you're interested, read up on object oriented programming, but for the first few units we'll keep it low.

Our first program will look like the following. I'll explain each line afterwards.
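A minimal sketch of such a program is shown here. The class name HelloWorld, the main method and the printed text come from this unit; the wording of the comments is only illustrative.

```java
/**
 * Our first class. This block is a JavaDoc comment: note the
 * double star after the opening slash.
 */
public class HelloWorld {

    /**
     * The starting point of the program.
     *
     * @param args the arguments given to the program when it starts
     */
    public static void main(String args[]) {
        /* A normal multi-line comment
           starts with a single star. */

        // A single-line comment: print the text to the display.
        System.out.println("Hello World!");
    }
}
```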

Comments

The first thing to note is the fairly natural language in this snippet. This is the easiest concept to grasp. Comments in code help make your code more understandable and easy to follow. It is of utmost importance to document your code especially when your projects get larger. One day you'll just leave it out and when you look at your code after a month you'll regret it - so it's better to get used to it right now.

Comments can be identified by being wrapped between /* and */ (a single star, I'll explain the double ones in the code too). Alternatively, for a single-line comment, it is enough to start it with //. Java is adamant about standard and correct coding and documentation, so much so that a particular category of comments is known as JavaDocs.

JavaDocs are basically the multi-line documentation blocks (those between /* and */) with a very small difference and requirement. JavaDocs have an extra * after the opening /* and are to be written in specific areas.

After defining the other basic parts of the code, I'll go into more details on JavaDoc. For now, it is fine to understand that we can write normal text in our code to help us understand what is happening. Note that the compiler will ignore your comments, so complaining in code is futile :P

Class

As explained earlier, everything in Java is defined as a class. Classes are a blueprint for an object in an object oriented system. For now, it is not that important, however it is wise to keep this in mind.

In our case, we have just one class named HelloWorld. You may have noted the 'public' keyword. This will make more sense later on, so for now think of it as a requirement for your program to compile.

In Java the file must be named after the public class it contains, so our program here must be saved to a file named HelloWorld.java. This 'limit' actually makes things much simpler - you don't have to remember a bunch of names for the same program.

Method

Methods are where we define functionality. Think of this as the method in your food recipe; it tells you how to put things together to get something done. In an object oriented system, objects (which are instances of classes, I'll explain this soon) are made up of values and methods. Methods operate on these values to return some other value. This concept will be explored in the next unit, however it is important to note that these values are called variables.

But let's get back to methods. In our case we have just one method, named main, and in it we explain what to do to print our "Hello World!". In Java a method named main is the primary starting point of the program. Think of programs as a water hose - the main method is the point at which the water starts flowing; the origin. This main method, though, has some extra details which we will include but will be explained later on in the series. As you can see, we started it again with public, followed by static and finally void. It also has a String args[] in the brackets. Let's dissect this declaration:

public static are keywords for the VM which will be explained later in the series
void is the return type. Remember from the definition we said that methods work on variables to return a value. In some cases, there is no value returned by the method after it runs. In those cases we say that the method does not return a value, so the declared return type is void. Return values, etc are not important for now, but I'm mentioning the terms so you can get used to such concepts in context.
main defines the name of the method. For now it is best to use unique names, however as we'll see later on, we can use duplicate names with some limits. You'll use the name to call the method.
(String args[]) is defining the method as taking one parameter or argument. Arguments are given to your method when it is called. In the recipe book, think of it as the book instructing you to put 100g of flour in the bowl. The 100g is a parameter to the method "put flour". This defines context and extra information for the method to work on. The method can access the parameter as if it were a variable.
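To tie the recipe analogy to code, here is a small sketch of one method that returns a value and one that returns nothing. RecipeDemo, addFlour and printBowl are hypothetical names invented for this illustration.

```java
public class RecipeDemo {

    // Takes two parameters and returns a value: the return type is int.
    static int addFlour(int gramsInBowl, int gramsToAdd) {
        return gramsInBowl + gramsToAdd;
    }

    // Takes one parameter and returns nothing: the return type is void.
    static void printBowl(int gramsInBowl) {
        System.out.println("The bowl holds " + gramsInBowl + "g of flour");
    }

    public static void main(String args[]) {
        // 100 is the argument: "put 100g of flour in the bowl".
        int bowl = addFlour(0, 100);
        printBowl(bowl);
    }
}
```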

System.out.println

This might seem a bit complex but let's analyse it like we did for the main. It is good to go over this again later on after we cover more topics, so you can better 'get it'. It is OK and expected that you will not understand the specifics right away. However, we shall go through the line:

System is a class, just like our very own HelloWorld. This class is provided with Java, so we did not have to write it. There are various methods and variables in this class.
out is a variable, or member, defined in the class System. It is not important to know the specifics, but the name is indicative enough as pointing to the output of the program. So up till now we have accessed the output of our program from the System class. Note that the type of a variable can also be another class. What you need to recall now is that such a variable holds an instance of a class (i.e. an object, whereas the class is the type of the variable).
println is, finally, a method of the out object. This is the one which does the writing, and as you can see, we gave it a parameter, which is the text to write.

As I said, try to get the idea, but for now we'll keep it very simple and it is enough to know that System.out.println("my text"); will print "my text" to the output.

Structure Summary

So let's wrap this up after which we'll compile and run our hello world!

Recall that in Java everything is a class. Each class defines methods and variables (we haven't used these yet). Variables can be of other class types too. When we create an instance of the class (which we haven't done yet either), it is known as an object.

In our basic case, we have just one class named HelloWorld with a single method named main. Here's the pattern now: the JVM is calling HelloWorld.main(). It does this behind the scenes and as you can see it is identical to the way in which we called System.out.println(). The pattern is <class><dot><method>. It is possible and normal to have multiple classes in one call, such as the System.out.println, which has two classes.

Going back to the JavaDoc, you can now see how the @param args is referring to the parameter passed to main. So what we are doing in JavaDoc is explain the use of each parameter in a method. Note also how the JavaDoc blocks are explaining the building blocks; the classes and the methods that we define.

Again, it is not vital to know these details yet, however as we go along they will start making a lot more sense.

Compiling

Compilation is fairly straightforward in Java. Let's go back again on this process as explained in Unit 0. Compilation in Java converts our code into byte code. We'll use javac to accomplish this, after which we will run it using the java command which will fire up a JVM to execute the byte code.

So, in order to compile, open up your console or terminal and navigate to the directory which contains your HelloWorld.java. For example if it is at C:\Users\james\mycode, enter cd C:\Users\james\mycode. The same goes for Linux and UNIX systems.

Once inside the directory, enter javac HelloWorld.java

This will not do much other than compile your code silently, unless you have some compilation errors. You should now see a new file called HelloWorld.class. This is not a source file but an executable for running in the JVM. That's all there is to compilation in Java, so now onto the best part of this unit - running the program.

Running

Running your program is even simpler than compiling it. All you need to do, in the same directory where you have the HelloWorld.class, is enter java HelloWorld in your console. Note that we do not add the .class extension. We are running the class, not the file per se.

If everything went well, you should see your first code running perfectly on your system, shouting Hello World! at you! Do not underestimate this simple code. It's where almost everyone began. It would be ideal to experiment a bit, that's the key for your success. Note my explanations, but doing extra reading will help you grasp concepts which you may not have correctly understood in your first reading.

Conclusion

This is your first step in a very long and never ending journey. Do not expect that a few years will go by and you'll be done learning. Programming is a very active field and it is best to keep looking for new concepts, languages, methodology, etc. But this is the vital first step.

The code for this unit may be found on the github repository.

Soon I'll be putting up Unit 2, where we shall be going into variables and more methods. Some text in this first hands-on unit may be disorientating for the absolute beginners, but do not give up. In a few weeks you'll be much more proficient in Java!

09 October, 2015

Java Programming Tutorial - Unit 0 - The Basics

Unit 0? If you're new to this world, it might seem odd, but you'll see why we start at 0. If not, well, you can probably skip this post for now.

If you decided to stay and read on, then welcome to the world of computer programming, where media reports are exaggerated and computers are very dumb ;)

A brief intro to programming

Cheesy Java code

Programming is "the act of instructing computers to perform tasks". Computers don't get it when you tell them "write text". What they do understand however is a series of bits, which ultimately lead to the text being written.

I'm not going into the great detail about the origins of programming, however the following is a very brief overview to put you in context.

The first computer programs were stored primarily on punch cards. These were the earliest forms of bits: a hole represented 'on' or '1' and solid card represented 'off' or '0'. As time went by, valves (vacuum tubes) were used, and nowadays we use semiconductors and integrated circuits. The concept has remained the same though. Writing ones and zeros is of course complex, so much so that hardly anyone programs directly in ones and zeros. What people did instead was devise a system to write meaningful text and then convert it to ones and zeros. For example, the earliest code, in Assembly Language, would have looked like this

It is complicated, unless you're an engineer in the 50's (although it's fair to note that it is still used today for very specific reasons). The development of programming languages followed exactly this pattern. A language becomes too unwieldy (projects get larger and reach the practical limitations of the language), so a new higher level language is created to cater for new features.

The next language then uses a single command to represent a group of lower level commands. As you might expect, it is vastly more complex than just wrapping the lower level, but you get the idea. A tool that converts the high level language to the lower level is called a compiler.

Once computers became mainstream, more and more different kinds of architectures were created. An architecture is a CPU design which usually has its own machine language (the way 1's and 0's are organised for it to understand). As each CPU typically understood different instruction sets, different compilers were written to "wrap" architecture-dependent code.

Now the next problem was that it's not as simple as writing the code once and compiling it for each architecture. The code usually had to be modified for each system it was intended to run on, so portability was lost. Having a large project would render this process impractical so a new language came along that tried to solve this problem once and for all.

Java

Java is a language released in 1995 by what was then Sun Microsystems (now part of Oracle). What's interesting about Java is that it is much more than just a language. Java is a whole ecosystem, having the language syntax, compilers, environments, SDK and community.

Duke, the Java mascot

But how did it solve the portability issue? As I mentioned, it is a whole ecosystem so there is a more elaborate system at play. What Sun did was create a language that is then compiled to a byte code rather than machine code. This byte code is then executed by the Java Virtual Machine (so, yes, it's still machine code, but a different kind of machine). This JVM is the only part which is differently coded and compiled depending on the architecture.

So we now have a language which can be written and compiled just once, with confidence that it will run on any kind of CPU as long as a JVM exists for that CPU. The VM, together with the standard libraries it needs, is known collectively as the Java Runtime Environment (JRE), in which the byte code interpreter plays a major role.

Over the years Java was prominent on the web in what are known as applets. Nowadays with the emergence of HTML5 and JavaScript (which has a relation of 0% to Java, so don't get confused), applets have become a thing of the past. Java is also popular on desktop applications, mobile phones, TV set top boxes, DVD players, etc.

Java today

Java has a very strong community and many standards go through what are known as Java Specification Requests (JSRs), much like Request For Comments (RFCs) if you're familiar with network protocols. Basically this is a process for definitions of ideas, standards, protocols, etc.

Through this process, Java has become arguably one of the top languages for high end websites (technically known as web apps). Twitter for example, runs on Java, so you get the idea of the strength of Java. Throughout this series we shall cover, quite in depth, how to write enterprise web applications in Java.

Java used to, and still does in a revived way, dominate the mobile world too. This sheer adaptability, from top range servers to mobile phones, without doubt, makes Java the most versatile language ever. In the early days of smartphones, Symbian was the king of mobile operating systems. It used to run a version of Java known as J2ME (Micro Edition). Today the Operating System with the largest active user base is Android. Surprise surprise, apps written for this OS are also in Java and use almost the exact same tools - it's a bit more complex - but we'll see as we go along how seamless it is to adapt your Java code to it.

Next Steps

So now you have a very basic idea of what programming is and how Java relates to it. Of course, a lot more resources can be found elsewhere if you're interested in more history and details. Wikipedia is one of those sites so you can head over there to further whet your appetite.

Following this post we shall start off with a basic Java environment set up; from the quick installation to your first Hello World!

Tutorials on Java and related subjects

Like many others in the field of IT, specifically software development, my passion for coding started way before I even considered looking for a job. I was probably 13 or so when I wrote down my first few lines of code in Pascal and the moments of glory in class when my mates looked in awe at that spanking grey-on-black "Hello World!".

Could have launched a satellite with this!

Many things have changed since then, but coding has remained a central part of my life...mostly because it pays my bills (more than that too :) ). One thing which got me to this point is the Internet community. I wouldn't have been able to write the second line of code had it not been for that tutorial on some obscure website. Of course, teachers and lecturers played a big part - they are not to be underestimated. But once you're out the door, the tutorials on the Internet are your "only hope". This only hope, though, is a treasure trove full of resources, from zero to hero.

So, after all these years I now feel able, and willing, to contribute back to the awesome community. My experience is largely in Java and so I hope I shall contribute valuable information to aspiring Java programmers. I shall start publishing a crash course in Java, starting with the venerable "Hello World!" and ending up who knows where. I'm aiming at enterprise Java, but we'll see. The target audience would be hobbyists, students, and even professional developers, all of course having a basic understanding of computers.

Thanks for coming by and I hope to welcome you again for the Java Tutorial Series!

08 October, 2015

Accelerated Mobile Pages

Browsing the web from our phones is nowadays a common thing. In fact you are now likelier to browse from your phone than from a desktop computer. Personally, I find myself using a desktop browser only while I'm at work or while doing some desktopy thing (such as coding or messing with VMs and networks). If I'm just browsing during the evening, for instance, it's 99% from my phone.

My preferred way of browsing is via the forum kind of applications, such as reddit or hacker news, so at that point I'm not really using a browser. However, the majority of the content is delivered from websites, so you see an interesting title, tap on it, and the in-app browser or the main browser is opened. This typically works fine, until the site you're accessing is a megalith and takes tens of seconds to load. After at most 3 seconds, if barely any content has loaded, the link is forgotten and I move on to the next link. That's it.

The problem is that these websites are offering too many features for them to be practical on a smartphone. Sometimes websites take even longer because they need to load the comments section, then come the suggested posts, with ultra big resolution images, followed by the author's biography... It's unnecessary, I just want to see content.

A team of internet companies, including Google, have come up with Accelerated Mobile Pages (AMP). It is primarily a technological development (not exactly unheard of, as we'll see), but through its restrictions it tries to limit the amount of unnecessary crap on pages. As I said, it's a development, however much of this development is in terms of standards and rules rather than faster networks, or something like that.

In fact, the focus is on basically banning a whole bunch of heavy and also some outdated HTML elements. Unsurprisingly, no more <applet>, no more <frame> and no more <embed>. There are also strict limitations on JavaScript, however the most surprising (but great) banned elements are <input> and <form> (with the exception of <button>). It may not directly impact the immediate performance of HTML, but it will surely stop developers from adding useless "post a comment" forms.
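To get a feel for what such a restriction means in practice, here is a minimal sketch of a checker that flags the elements mentioned above. It is not the official AMP validator; the tag list only contains the elements named in this post (the real rules cover more), and the names BANNED_TAGS, BannedTagFinder and find_banned_tags are made up for this example.

```python
from html.parser import HTMLParser

# Assumption: only the tags named in this post, not the full AMP rule set.
BANNED_TAGS = {"applet", "frame", "embed", "input", "form"}


class BannedTagFinder(HTMLParser):
    """Collects every banned tag that is opened in the parsed markup."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lowercase.
        if tag in BANNED_TAGS:
            self.found.append(tag)


def find_banned_tags(html):
    """Return the banned tags in the order they appear in the markup."""
    finder = BannedTagFinder()
    finder.feed(html)
    finder.close()
    return finder.found


if __name__ == "__main__":
    page = "<h1>Title</h1><form><input name='comment'><button>Send</button></form>"
    print(find_banned_tags(page))
```

A page with a "post a comment" form, like the sample above, would be flagged for both <form> and <input>, while the <button> passes.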

The focus is primarily on immediate content. If I get a link while chatting and I open it up, I don't have more than 3 seconds to read the title and move back to the chat. Thankfully, on Android, this experience shall now improve with the new Chrome tabs introduced in Marshmallow. It's a technical thing, but basically it avoids having to use either an in-app browser (which is isolated from your standard Chrome) or opening up Chrome (which is slow).

Chrome tabs are much faster, at least in this demo (via Ars Technica)

But let's get back to AMP. As I said, it is content that the majority wants, so in this age of platform webapps, single-page sites and all the rest, simplicity, again, trumps features. Despite the lack of features, static areas of a website are hugely important. If you're interested, here's a short how-to, however it is fair to note that static this time is mostly client side, so no JavaScript - which means you'll probably need server-side processing if you have "dynamic" content.

AMP avoids the common JavaScript the web is used to and realises the idea of Web Components. These do have JavaScript under the hood, but since they are managed differently, it makes the page load faster without synchronous blocks by JavaScript. AMP also restricts inline styling, conditional comments and some CSS attributes (although CSS is not so limited compared to JS).

As yet (it has been only days or hours since the announcement), I personally do not consider this a major technological breakthrough - it's only a set of rules to reduce the bloat on webpages that primarily host content. However, I am very happy with the way things are going and I do hope it gains traction.

The benefits I see are greatly improved user experience with much faster load times and no nonsense web pages along with better development. The more modular the pages, due to web components, the easier it is to develop. There are no messy inline styles or randomly placed JavaScript. Things are put in their place and the rules are strict - otherwise you'll not qualify for AMP and your page won't make it to the top of search results.

Unfortunately, I don't have that much control on this blog, otherwise I would have AMP'd it right away!

For further details, there are quite some resources:

07 October, 2015

The Volatile Security of Volatile Memory

I forgot about yesterday...

It is the black box in every system, even our brain. Volatile memory goes by many names: working memory, temporary memory, RAM, even just memory. Whatever your preference, when you mention it you're most likely referring to the area of a system in which data is stored for a relatively short period of time until it is used and then discarded (or transferred to persistent storage). One cannot possibly imagine a system without some form of memory; even if it is the same area where data is stored permanently, there is still some area used for temporary calculations.

Among the major differences between RAM and persistent storage is that RAM typically contains data about the processes that are currently in execution along with the data we are working on right now and will be discarded soon (yes I hear your screams, persistent storage does that too, but it also has data that we haven't looked at for months). Along with this fact, hard disks enjoy the possibility of being totally encrypted. They cannot be read unless the key is provided. This is not possible in RAM, primarily because the CPU cannot work with encrypted commands.

I do not mean that the CPU is unable to take encrypted data and decrypt it to plain text. What I am referring to is the inability of the CPU to understand encrypted commands (opcodes), or to work on encrypted data as data rather than as a decryption payload. Let's say we have the binary value 13 = 1101 and we want to add it to 5 = 101. Our simple XOR encrypter gives us the values 0111 and 000 for the keys 1010 and 101 respectively. Adding 0111 and 000 gives 0111 (7), not the actual result 18 = 10010. The values have to be in plain text before actual processing. XOR is simple and integral to CPUs, so it is the simplest operation with which to decrypt the values. Once decrypted, it is then possible to add them.
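The arithmetic above can be checked with a few lines of Python. This is only a sketch of the toy XOR example from this post, not a real encryption scheme, and the function names are made up for illustration.

```python
def xor_crypt(value, key):
    """XOR is its own inverse, so the same call encrypts and decrypts."""
    return value ^ key


def add_while_encrypted(enc_a, enc_b):
    """What the CPU would compute if it added the encrypted values directly."""
    return enc_a + enc_b


def add_after_decrypting(enc_a, key_a, enc_b, key_b):
    """Decrypt both operands first, then add the plain values."""
    return xor_crypt(enc_a, key_a) + xor_crypt(enc_b, key_b)


if __name__ == "__main__":
    enc_13 = xor_crypt(0b1101, 0b1010)  # 13 encrypted with key 1010
    enc_5 = xor_crypt(0b101, 0b101)     # 5 encrypted with key 101
    print(format(enc_13, "04b"), format(enc_5, "03b"))
    print(add_while_encrypted(enc_13, enc_5))
    print(add_after_decrypting(enc_13, 0b1010, enc_5, 0b101))
```

Adding the encrypted values gives 7, while decrypting first gives the expected 18, which is why both the plain values and the keys end up sitting in RAM.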

But here is the problem - where is the key stored? Of course, working memory. What is the point of encrypting the data in RAM when the key is in the same RAM? What is the point of encrypting RAM after all?

Boom!

We encrypt disks because they can be removed or because they are portable, yet still contain data, unlike RAM which holds it until we turn off the system (or a bit longer if you're into memory freezing and forensics). So, we think, RAM is inaccessible to would-be hackers. Or so we used to think.

Recent research by various people and organisations (Sophos, Brian Krebs, Volatility Labs, among others) has identified a small and simple piece of malware that looks up processes, maps their memory regions, copies the contents and sends them to the attacker's server for them to enjoy. And by the way, the kind of data was not your ex's text messages but the PIN to your credit card, so it's a bit more expensive I would say.

Use only for great dinners.

I did my own research (and eventually BSc. thesis) on this subject, and it is quite scary knowing that the very heart of your system may be so easily compromised. What's worse is that when you enter your PIN into any other system on which you have no control...God knows what's running on them and where your data goes. Anti viruses barely have an idea how to capture such an attack, and neither do firewalls, internet protection or whatever you have. If they did, they would block your debugger too, because that's how it works - like a debugger. It's like a kitchen knife used for a murder - you cannot ban knives.

Here's a short and sweet step-by-step on how you can scrape your memory. It's not intended to attack anyone, and it wouldn't be easy any way. It's successful only if your target cannot protect their networks and you manage to get in. The sample was done on Linux; Windows would be totally different but still very possible (the Target attacks were in fact on Windows). So here it goes:

A dummy little program was written in C. All it did was store a username and password (entered using getpass() for increased security) along with a series of credit card numbers that are "swiped" into the system.

Swiping cards

We then find the PID of this running process just by ps aux | grep scrape (the program is named scrape, but it may be something like POSSwiper for example)

Getting the PID

Now we can get all the memory regions and maps used by our processes. The /proc directory gives us a hand there.

/proccing to analyse the memory

We are interested in the heap space of our program, which shows up nicely in the fourth line, ranging from address 00958000 to 00979000 (both hex). The next thing we do is fire up the actual scraper, which in our case is the kitchen knife: a legitimate gdb debugger.
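If you would rather not read the maps by eye, picking out the heap range can be sketched in a few lines of Python. The function name find_heap_range is made up, and the sample text is an invented stand-in for the contents of /proc/<PID>/maps; only the heap addresses are the ones from this post.

```python
def find_heap_range(maps_text):
    """Return (start, end) of the [heap] region as integers, or None."""
    for line in maps_text.splitlines():
        fields = line.split()
        # The last column names the region; the first is "start-end" in hex.
        if fields and fields[-1] == "[heap]":
            start, end = fields[0].split("-")
            return int(start, 16), int(end, 16)
    return None


# Hypothetical sample in the layout of /proc/<PID>/maps (not real output).
SAMPLE_MAPS = """\
00400000-00401000 r-xp 00000000 08:01 1234 /home/user/scrape
00600000-00601000 r--p 00000000 08:01 1234 /home/user/scrape
00601000-00602000 rw-p 00001000 08:01 1234 /home/user/scrape
00958000-00979000 rw-p 00000000 00:00 0 [heap]
"""

if __name__ == "__main__":
    start, end = find_heap_range(SAMPLE_MAPS)
    print(hex(start), hex(end))
```

On a real system you would read the text from /proc/<PID>/maps instead of the sample string, which needs the same permissions as the rest of this walkthrough.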

Dumping memory in just one line!

GDB shows a bunch of text; we're only interested in how we started it (gdb -pid <PID>) and how we stole the memory (dump memory <to where> 0x958000 0x979000). As you can see, we used the exact heap memory range we got from /proc. The memory will be dumped to the file we choose. Of course, this requires administrator rights, but as one might expect, tens or hundreds of POS devices will most likely share the same password, and will probably have the default one too (such a typical case of a security breach - I found the password to my ISP's router on a public forum...).

Now, onto the next step - the analysis, if you can call it that. Data dumped into the file is from RAM, so as expected it is binary. Linux simplifies this analysis by providing another tool - strings. All it does is see what's in a file and spit out all the strings it can find. That's it, so we pass the dump to it and we get a nice list of strings, including the password (you didn't see it in the first screenshot because of getpass()) and all the numbers and everything.
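To show there is no magic in that step, here is a rough Python sketch of the idea behind strings: find runs of printable ASCII characters of at least a minimum length. It is an approximation, not a reimplementation of the real tool, and the sample bytes below are made up rather than taken from an actual dump.

```python
import re


def extract_strings(data, min_length=4):
    """Return runs of printable ASCII (space to ~) at least min_length long."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_length
    return [match.decode("ascii") for match in re.findall(pattern, data)]


if __name__ == "__main__":
    # Made-up bytes standing in for a memory dump.
    dump = b"\x00\x01hunter2\x00\xff4111111111111111\x00ab\x00"
    for text in extract_strings(dump):
        print(text)
```

Short runs such as "ab" are skipped as noise, while the password and card number in the sample come out in the clear.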

The gold mine

That is all. Now go and whitelist the list of processes on your system, before someone gets to scrape the memory off it.

06 October, 2015

Some points on the Android UI

Android is a great OS - there's no doubt about that, even if you measure that statement using the number of active installations. It has an interesting history, starting from plans to create an OS for digital cameras and ending up being Google's core mobile platform running on around a billion devices. It is technically well designed, open source and very adaptable; from CPU architectures to screen sizes, Android can adapt.

Progress of Android over the years (Ars Technica)

As Android progressed to meet the expected standards of the day, the general UI got more minimalist while more colours were introduced (older versions looked darker). Despite the move towards a more modern UI in general, it is still possible for application developers to apply their own style. A typical result of this support was that developers of older applications did not bother updating their styles to the latest version (this is basically an XML file).

What we ended up with is a FIAT 127 in 2015's motorshow.

Not quite in the same league

The problem with this situation is that not only do we sometimes have to use outdated applications, but Google is also pushing a new 'UI language'. There is nothing wrong in having a new UI language...except when few developers are following it, and you're not one of them. If Android is to have a uniform, clean and modern UI, there should be a mechanism which automates the transition of styles to the latest standard in cases where the default file was left lying around. Automation is not uncommon in the Android ecosystem - code is checked for potential errors, style files must be up to standard, even copyright issues are flagged by a bot - so why not a simple style file?

What's the deal with this UI (SIM Tool Kit)?

As I mentioned earlier, there is also another problem which Google does not seem to want to fix - Material Design. Consistency is key in product branding and Google is/was known for their efforts in this regard. The ubiquitous bar in all their web products and their logo in the exact same location made it clear that this is a Google product.

Nowadays, their Android apps are a cacophony of UI element styles and whatnot. Despite their efforts to make Material Design the next standard, it's already been 2 years and I have no idea when this "next" will become "now". It can be seen in some apps, such as the settings, and the major Google apps. However, applications such as Google Analytics still sport the Android 4.3 UI. Even worse is the app for Blogger - with a UI probably designed by the Romans.

Yet again, even though the apps follow the general material design, all of them seem to have a language of their own. One aspect which was recently highlighted was the lack of consistent scrollbars. Now we got a new scrollbar in the application launcher too, for diversity.

The Google Calendar app has to be one of my favourites. It's fast, visually appealing and above all, useful. I use it regularly to set appointments, reminders, etc. just like all other users. The problem with the whole Calendar ecosystem is the web version. Why has Google introduced Material Design, implemented it correctly in Calendar on Android yet left the web version in the dark, while at the same time it developed the Inbox service with correct material guidelines on both Android and the web (I'm not discussing whether Inbox is practical or not)? I understand the drive towards mobile and I truly appreciate the improved UI on mobile, but I'm not in favour of sheer inconsistency (and then again, there are web versions better than their apps).

Yes this was quite a rant - not really helpful for many. But it gets frustrating when you're working on your services and try to follow as many guidelines as possible to make your users happy. Thankfully apps are not accepted or rejected based on their looks and interaction, although sometimes I do favour such a system as it does improve the users' mobile experience.

04 October, 2015

Linux, Virtualisation and some Performance monitoring

P.S. This post is more of a noob's voyage towards better virtualisation on Linux than some professional guidance.

A few days ago I was facing an issue on my Linux machine where performance suddenly dropped to near unusability while the hard disk LED was on overdrive. My first thought was that there may be some excessive swapping going on. The problem was, though, how to identify what was causing this rather than what was happening.

Cheesy image for computers!

I could have guessed what the reason was since I had just turned on maybe the 10th VM on VMWare workstation. Despite this fact it was not immediately obvious which VM might be swapping rapidly or why it was doing so (there shouldn't be much memory usage during startup).

As yet, I haven't totally found out what it was, but messing with some config files did the trick up to a certain point. First of all I limited VMWare to 16GB of RAM (out of 32) and configured it to swap as much as possible. I was led to believe that VMWare's and the kernel's swapping mechanisms weren't on the same terms, which ended up with me bashing (excuse the pun) the whole system. A few miraculous key presses (Ctrl Alt F1) took me to the terminal from where I could, at least, get a list of CPU heavy processes and kill them. Unfortunately it was not just vmware-vmx processes but also kswapd0 - an integral part of the system which won't easily allow you to kill -9 it. So basically this was the first indication of a memory issue.

After some googling I reconfigured swapping etc. but I wasn't able to replicate the issue and quite frankly I really did not want to spend 15 every time to recover my system. So the process of finding a solution took days - not continuously trying to fix it of course. The best solution I could come up with was buying a small 50GB SSD and using it all for swapping. Apart from that I also set vm.swappiness to a nice 100. The memory configuration on VMWare was set to swap as much as possible too. My idea was to allow everything to swap as much as it could since the disk was much faster now. Apart from that, I'd have as little occupied memory as possible.

I thought I'd start seeing a lot of fast swapping this time, so in case I got into the same situation again, it would be much easier to recover. In fact it did happen once again, but this time the system was under much more stress, so the extra swapping did help. This time I had a little script prepared so the 10 second long keypresses would not waste much of my time when typing in all the arguments. I used the following script to see what was hogging the CPU, network, disks - almost every possible bottleneck I could think of:

#!/bin/bash
dstat -cdnpmgs --top-bio --top-cpu --top-mem

Short and sweet, just calling dstat with canned arguments! Calling jtop (the name I gave the script) is a lot shorter than all those arguments, that's for sure. Again, the result was a swapping issue.

dstat however showed me something I was not really expecting. RAM usage wasn't really that bad, actually, just by looking at numbers it was great - less than 50% usage. However there were some more numbers and at that point I was not sure if I was actually using ~40% or 97%.

Reading up on Linux memory management taught me another thing. Linux is actually making use of much more RAM, however the bulk of it is caching. This cache is cleared when more memory usage is required by processes. Effectively I would see that there is less than 2-3% free RAM but that is not the correct way to read it. So there is some silver lining to this issue - I got to learn quite a lot more on memory management on Linux.
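Here is a small sketch of how those numbers can be read, assuming the usual /proc/meminfo layout (MemTotal, MemFree, Buffers and Cached, all in kB). The function names and the sample figures are made up for illustration; they are not measurements from my machine, and the calculation is only a rough approximation of what the kernel reports.

```python
def parse_meminfo(text):
    """Turn 'Name:   12345 kB' lines into a {name: kB} dictionary."""
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            values[name.strip()] = int(parts[0])
    return values


def memory_summary(info):
    """Split total RAM into free, cache and memory held by processes (in %)."""
    total = info["MemTotal"]
    free = info["MemFree"]
    cache = info["Buffers"] + info["Cached"]
    used = total - free - cache
    return {
        "free_pct": round(100 * free / total, 1),
        "cache_pct": round(100 * cache / total, 1),
        "used_pct": round(100 * used / total, 1),
    }


# Invented figures in the layout of /proc/meminfo.
SAMPLE_MEMINFO = """\
MemTotal:       32000000 kB
MemFree:          800000 kB
Buffers:         1200000 kB
Cached:         17200000 kB
"""

if __name__ == "__main__":
    print(memory_summary(parse_meminfo(SAMPLE_MEMINFO)))
```

With these sample figures only 2.5% of RAM is strictly free, yet processes hold just 40% of it; the rest is cache that the kernel can give back when needed.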

Following this result I started looking for a virtualisation solution that did not try to re-implement what the kernel was built to do. Not that I have anything in particular against VMWare or how it is implemented, but I was quite sure that the problem was originating from it. After a bit more educated reading on virtualisation, and a bit more courage to move out of my (then) GUI-based comfort zone (a few weeks before the said case I was mostly a Windows user...), I came to the conclusion that the Linux-based systems were potentially much better.

The logo is cool though

Here I introduced myself to KVM and Xen. Both appear to be more ingrained into the system and had potentially better memory management. I read up on some general performance and history of both systems and KVM appeared to have the upper hand. Being a more integral part of the Linux eco-system (and marginally faster https://major.io/2014/06/22/performance-benchmarks-kvm-vs-xen/) I opted to base my future VMs on KVM. I'm happy to say that I've never looked back since then and the impressive performance I enjoy on KVM is (on my system) unparalleled.

I'll let the kernel manage my memory

There is no particular conclusion here, except that maybe you should be familiar with your options before making decisions. I've got nothing against VMWare, as I said, I simply found something that works better for me. Management tools are far better on the VMWare side, but I'm satisfied with what VM Manager offers in terms of management and monitoring. Oh, you may also make use of the "script" I have. It's convenient when you need to see some performance details while not keying in some 5 arguments. I might write something on KVM next time, since it allows one to define many more options rather than a few clicks and done.

02 October, 2015

Hosting static websites on AWS S3

Hosting websites nowadays has become quite simple, easy and affordable. Years ago you would try to find free hosts which would allow you to upload a few files and you're done. GeoCities and freewebs were some of the most popular of these services. As time went by, the data centre landscape has changed dramatically and the current situation is big companies offering extremely cheap hosting services. The market is so huge that the classic "brochure site" has become almost free to host while still enjoying world class availability and scalability.

Static websites are the simplest form of site. These are just a set of HTML pages linked together via links. No complicated code, no weird configurations - just plain old files. Depending on what you want, such a site may be ideal (maybe you do all your business transactions over a facebook page and use the site just as a brochure).

This of course has the advantage of being simple, easy, cheap and can be up and running very quickly, including development time. It lacks, however, some of the major features you may want, such as a member's area, blogs and news, user-generated content etc. But then again, you might not want these extra features. In that case, here is a simple, short and sweet guide on how to host your site on Amazon Web Services. There is no doubt that it is currently the leader in cloud services and it would be wise to use their services.

1. Get A Domain

The first thing you need is the dot-com. You may use your favourite registrar or just randomly choose one from these that come to mind: noip.com, namecheap.com, godaddy.com. If this is your first time you may want to read up on registration etc. but all you need is to find a domain that is available, buy it, and configure it as explained later. Make sure you do not buy hosting with it, as some providers will try to bundle them together. Well you can do whatever you want, but it's not necessary in this case.

2. Sign Up with AWS

Log on to aws.amazon.com and sign up for an account. Choose your region, etc and keep going until you get to the main control panel.

3. Hold your horses!

The control panel may look overly complicated. It isn't, though. The number of services, and even their names, may be overwhelming, but we'll get through. Only two services are required in our case.

Cloud providers typically offer more than just simple hosting. Keep in mind that big enterprises are running their businesses here too, so this complexity is to be expected. One of the core offerings of a cloud provider is storage. Storage keeps everything in place - services need to save their logs, applications exist in storage, databases are persisted to storage...you get the pattern. Again, due to the enterprisiness of this offering, the storage services have their own terminology.

Your usual "hard-disk" or "USB drive" (or floppy disk) is known as a bucket. You have a bucket in the cloud in which you put your files. Amazon offers storage in a service known as S3 - Simple Storage Service. These buckets also tend to be dirt cheap. A site of fewer than 10 pages with low to moderate traffic may cost you no more than €1 a month.

4. Creating the site

Now that you know about the basic concept, it is time to create the storage for your site. In this example (and pretty much any other tutorial out there), we shall use the example.com domain. Whenever you see this written down, replace it with the domain name you bought. Do not prepend it with "www."; that is a subdomain, not the proper domain that you bought.

4.a. Sign in to https://console.aws.amazon.com/s3/;
4.b. Create a bucket named example.com;
4.c. Create another bucket www.example.com (with the www);
4.d. Upload your content to the first (example.com) bucket;

What we'll do is host the site on the example.com bucket and redirect any traffic coming in to www.example.com to it.

5. Prepare the site

Now you'll need to allow the public to access your buckets, otherwise they'll be forbidden from seeing your content (which, presumably, you want to be publicly accessible). All you need is to attach the following bucket policy to your example.com bucket. Again, make sure you replace example.com with your domain.

5.a. Set the policy
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AddPerm",
      "Effect": "Allow",
      "Principal": "*",
      "Action": ["s3:GetObject"],
      "Resource": ["arn:aws:s3:::example.com/*"]
    }
  ]
}

5.b. Set up as a static website by clicking on the bucket and selecting the Static Website Hosting section. Choose the 'Enable' option;
5.c. Name the index page. This is the "homepage" of your site. Typically this is named "index.html" or similar;
5.d. Test the page by entering the Endpoint URL shown to you in your browser, just to make sure it is accessible;
5.e. Select the second bucket (www.example.com) and in the same section choose to redirect requests. In the input field, enter your domain (without www.);
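
If you have more than one site, typing the policy from step 5.a by hand gets error-prone. As a small sketch, the same policy can be generated for any bucket name in Python; the function name public_read_policy is made up for this example, and the output still has to be pasted into the bucket policy editor yourself.

```python
import json


def public_read_policy(bucket_name):
    """Build the public-read bucket policy from step 5.a for a given bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AddPerm",
                "Effect": "Allow",
                "Principal": "*",
                "Action": ["s3:GetObject"],
                # Every object in the bucket, hence the trailing /*
                "Resource": ["arn:aws:s3:::%s/*" % bucket_name],
            }
        ],
    }


if __name__ == "__main__":
    # Replace example.com with the domain (and bucket name) you bought.
    print(json.dumps(public_read_policy("example.com"), indent=2))
```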

6. Wire it up

Another service that is required to properly route traffic to our site is Route 53. As you've seen, your endpoint is appended with a much longer address belonging to Amazon. You wouldn't want to distribute that URL to your clients; after all, you bought your own domain.

Route 53 is basically a DNS service - an internet directory for converting example.com to a number that the internet understands. You do not need to do any of this work yourself, except for informing the registrar about your shiny new website on AWS. Here's how:

6.a. Open up https://console.aws.amazon.com/route53 and create a hosted zone for your domain (no www.) - Click Get Started Now under DNS Management, or just go to Hosted Zones and then Create Hosted Zone;
6.b. In the details section you'll see a Delegation Set - a list of addresses. Write these down somewhere, we'll use them later on;
6.c. Click Create Record Set and enter your domain name. Mark it as Alias and from the Alias target select your bucket. Do this also for www.example.com (point it to its own bucket).

7. Finishing

Now that everything is set up on AWS, all you need to do is inform the domain registrar (the site from where you bought your domain). Remember the 4 addresses in the Delegation Set? These will now be used to configure the DNS addresses for your domain. What you need to do is log in to your domain registrar control panel and configure your domain. Somewhere you should be able to change the DNS settings for it. Not all providers have four fields - there may be more, there may be fewer. Enter the addresses from the delegation set in the domain configuration. If there are fewer than four fields, fill in as many as you can and that's it. If there are more than four, leave the rest empty.

8. Live!

Now that you're done, you may need to wait a few minutes until the DNS settings are updated. This is not related to your site on AWS but to the nature of DNS - i.e. people's ability to enter example.com and be properly taken to your site. This may take up to 48 hours, but in my case it was only a matter of minutes.

Hope you found this helpful!