What is Knowledge?

December 6, 2013

Knowledge is a domain that encompasses contradictory traits; “knowing” describes holding a statement, whether true or false, to be absolutely true. Knowledge can be classified* into one of three classes. First, knowledge that I know I know: the knowledge I am aware of and know very well. Second, knowledge that I know I don’t know: knowledge I am aware exists, but that I am unfamiliar with or do not know to be unequivocally true. Finally, knowledge that I don’t know I don’t know: the largest class, simply because I cannot be an expert on every topic, and what I know spans a small percentage of all knowledge. So I am going to claim that no one can have absolute knowledge; instead, we can only have relative knowledge that can be viewed as subjective or objective.

I hypothesize that there must always be knowledge that I don’t know I don’t know. Suppose I believe I know of the existence of all knowledge (knowledge I know I know, and knowledge I know I don’t know); that would mean nothing exists beyond my knowledge or awareness. But this belief does not contradict the claim that there exists at least one thing I don’t know I don’t know, so both claims can coexist. The latter claim can therefore be true, and it cannot be proven untrue unless there is a source of absolute knowledge to use as a reference against my knowledge level. Thus, there is always the possibility that something exists that I don’t know I don’t know, and this cannot be ruled out unless all knowledge is finite. If knowledge were finite, there would exist a point in learning at which there is absolutely no chance that I do not know what statements I do not know. But since I have already shown that I cannot verify the existence of things I do not know I do not know, there is no way to prove that knowledge is finite. Thus, knowledge is infinite: no matter how much I learn, the claim that there exists at least one thing I do not know I do not know remains valid, precisely because it is intrinsically unverifiable.

Another argument that no one can have absolute knowledge is that knowledge can take many versions of the truth. Believers in different gods hold different knowledge of what a god is. If I possess absolute knowledge, then I must “know” all the different versions, which means I have to hold every possible opinion on any topic and “know” each to be true individually. But holding all possible opinions contradicts being opinionated or knowledgeable on a topic at all. This can be shown simply: an opinion composed of all opinions is itself one opinion, and there exists at least one opinion that disagrees with it (anyone who “knows” one version implicitly disagrees with all other versions). That dissenting opinion is not included in the universal opinion, unless the universal opinion also includes opinions that disagree with itself! That is a self-admission of a weakly formed opinion, which does not rise to the strength of knowing something absolutely.

What is knowledge? Knowledge is “knowing” something absolutely. It can be viewed as subjective since it can be polluted with one’s opinion, although to that person, her opinion is truth. “Knowing” one’s god doesn’t mean that that specific god exists. I can “know” anything I want, because it doesn’t have to be absolutely true, even if to me specifically (in my opinion) it is. It is easy to show that knowledge is not always true by looking at all religions: they cannot all be true, especially given the contradictions among different aspects of those religions. Furthermore, if knowledge were always true, then knowledge would not exist today, because I cannot know for sure whether what scientists discovered today will remain standing (remain true) after future discoveries, or whether my understanding of the nature of things around me will hold. So knowledge does not exist until its definition is relaxed to encompass false statements as well as true ones. Knowledge has to include both true and false understandings of a phenomenon. For example, “knowing” that Jesus Christ was resurrected is held to be true by a Christian and false by a Muslim. Thus, “knowing” that Jesus Christ was resurrected can be simultaneously true and false in the domain of knowledge.

Another argument that knowledge doesn’t exist, if it has to mean absolute correctness or trueness (invariance across people), is that my knowing of my god and your knowing of your god mean that neither of us possesses any knowledge, given that there are two versions of it and both cannot be true at the same time. So knowledge can be false and subjective, and can have many versions of varying truthfulness.

How is knowledge different from opinion? Belief? Understanding? An opinion is a formulation, or the result, of a collection of knowledge elements or facts; a person formulates an opinion after acquiring knowledge of a certain subject. An opinion leads to an established or fixed belief that is no longer subject to deviation. Understanding of something denotes weaker knowledge of the topic. “Knowing” is the most powerful understanding of something.

Knowledge is relative and subjective. In Plato’s cave, one learns a lot about surrounding objects by observing them. Sometimes what you observe is a minimal representation of the actual object or its original behavior. Maybe I just see shadows of the real object, and the object, although fairly represented by its shadow, is significantly different in reality. My knowledge of the object may converge, or even diverge, as more of its characteristics are revealed. Furthermore, since the learning experience cannot be proven to be bounded, there is no way to determine its ceiling; knowledge can only mutate and never be fixed. The moment knowledge is fixed for all topics is the moment we solve all problems in the world, observable and unobservable, and reach absolute knowledge.

Is there such a thing as absolute knowledge? Can it be attained by anyone? Can it be described? What is absolute knowledge? I know what knowledge is (defined above). I know that knowledge does not always have to be true, correct, or complete. And I know absolute knowledge to be “knowing” all versions of everything. If all knowledge can be quantified and identified, then it must also be assumed attainable, which means all knowledge has to be finite. I have already argued that knowledge is not finite and can never be proven finite, because such a proof would have to resolve the following contradictory statements:

1. All knowledge can be reduced to knowledge that I know I know, and knowledge that I know I don’t know after some learning.
2. There exists at least one thing that I do not know I do not know.

Thus knowledge is infinite: the first statement cannot be proven, because the second is unprovable; we cannot prove that all the things we are not aware of do not actually exist. This implies that knowledge is infinite and cannot be fully attained, and that our knowledge is relative and can, and will, encompass both absolutely true and false things.

* The three classifications mentioned above are borrowed from a presenter at the No Fluff Just Stuff IT conference in 2009.


Summary of Instruction Level Parallelism Limitation

December 5, 2012

Limitations of ILP

1. Most code is not highly parallelizable; Amdahl’s law imposes an upper limit on the speedup from parallelization. ILP speedup may need to reach a factor of 5 to 20 to be generally accepted, given the VLIW processor complexity that must be introduced to achieve it [A]. In simulations, even with increased hardware available to the processor, programs did not achieve a corresponding linear increase in performance by exploiting ILP [C].
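The Amdahl ceiling above can be made concrete with a short sketch. The 80% parallelizable fraction below is an illustrative assumption of mine, not a figure from the cited papers:

```python
# Amdahl's law: if a fraction p of execution can be sped up by a factor s,
# the overall speedup is bounded by 1 / ((1 - p) + p / s).
def amdahl_speedup(p, s):
    return 1 / ((1 - p) + p / s)

# Even with effectively unlimited ILP on 80% of the code,
# the serial 20% caps the overall speedup just below 5x:
print(amdahl_speedup(0.8, 10**9))
print(amdahl_speedup(0.8, 4))
```

This is why a modest per-region ILP speedup translates into a much smaller whole-program gain.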

2. Data dependency. Either stall and waste resources (keeping the hardware simple), or:
2.1. Data speculation [B] – complexity of handling predictions (serialized and non-serialized methods), handling spurious exceptions, memory disambiguation/aliasing, and misprediction undoing penalty.
2.2. Forwarding – does not completely avoid stalling, and increases hardware complexity.
2.3. Instruction reordering (compiler and/or hardware) [C]. This technique removes data dependencies that mostly originate from waiting for a memory operation to finish (memory operations account for 30-40% of total code [C]). It introduces hardware complexity such as reorder buffers, which allow out-of-order execution but in-order retirement of instructions [C]. Memory delays can also be shortened through the use of I/D caches, but those present their own challenges and limitations (limited size, enforcing data coherency across processor caches, the overhead of managing efficient cache replacement policies, etc.). Generally speaking, there are many ways to accomplish instruction reordering:
2.3.1. Compilers can tag instructions with dependency flags (all instructions this current instruction is dependent on) such as in dependence processors (Dataflow). This can also be accomplished by the processor itself without the help of the compiler such as in sequential (superscalar) processors (although they may also use suggestions from the compiler but will not guarantee using those suggestions) [A].
2.3.2. Compilers tag how many instructions within the last M instructions that this current instruction is dependent on (such as in Horizon processor) [A].
2.3.3. Compilers can group instructions together in traces. This includes moving and duplicating instructions across basic blocks so that instructions can execute (early) as part of a basic block that will definitely be executed regardless of intervening control decisions. The result is larger code size, but higher overall performance.
2.3.4. Compilers can use register renaming and loop unrolling to remove data dependencies across iterations and speed up execution (separate iterations can execute in parallel); this is referred to as software pipelining [A]. It adds a trade-off between unrolling more iterations for higher throughput and increasing code size, since some of the unrolled iterations may end up unnecessary (the loop may end earlier than the unrolled code assumes). Software pipelining goes beyond loop unrolling: it also moves code not dependent on the loop outside of it (some to the top [prologue] and some to the bottom [epilogue] of the truly iterable middle [kernel]), and then re-rolls the truly iterable code. This type of compiler scheduling is called modulo scheduling [A]. The approach can also cause register spilling (more registers needed than strictly necessary for program execution), requires condition prediction (we must speculate statically on whether the loop will execute at least one more time before unrolling), and still faces true dependencies on data written in iteration < i and used in iteration i, memory aliasing issues (will some pointers in one iteration write to the same address as in subsequent iterations?), etc.
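The stall-versus-reorder trade-off in this item can be sketched with a toy single-issue, in-order pipeline model. Latencies and register names here are entirely hypothetical, not taken from any real processor; the point is only that hoisting an independent instruction between a load and its consumer hides part of the load latency:

```python
# Toy single-issue, in-order pipeline (illustrative only): an instruction
# stalls until every register it reads is ready.
def total_cycles(program, load_latency=4):
    ready = {}          # register -> cycle at which its value is available
    cycle = 0
    for dst, srcs, is_load in program:
        start = max([cycle] + [ready.get(r, 0) for r in srcs])  # stall here
        ready[dst] = start + (load_latency if is_load else 1)
        cycle = start + 1   # next instruction issues the following cycle
    return max(ready.values())

# r1 = load; r2 = r1 + 1; r3 = r4 + r5 (independent of the load)
naive     = [("r1", [], True), ("r2", ["r1"], False), ("r3", ["r4", "r5"], False)]
# same work, but the independent add is hoisted between the load and its use
reordered = [("r1", [], True), ("r3", ["r4", "r5"], False), ("r2", ["r1"], False)]

print(total_cycles(naive), total_cycles(reordered))
```

With these made-up latencies the reordered schedule finishes one cycle earlier; real compilers and reorder buffers perform the same kind of hoisting at much larger scale.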

3. Control dependency. Either stall and waste resources, or:
3.1. Branch prediction – complexity of handling predictions, handling spurious exceptions, and misprediction reversal (reservation stations, instruction caches, and instruction commits are used to allow this).
3.2. Simultaneous branch execution (branch fanout) – more functional units, more registers, waste of resources that otherwise could have been used somewhere else, more management of what gets committed, and what gets to be discarded, etc.
3.3. Compilers and hardware working together to fill more delay slots and reorder instructions.

4. Optimizing and parallelizing programs depend on knowing which basic blocks (or portions of the code) would run more frequently than others – execution frequency of basic blocks [A]. This can be achieved via analysis of the control flow graph of the code through the use of profilers. This is not a straightforward process, and it is highly dependent on the workload at runtime.

5. Basic block ILP limitation [A] can be mostly attributed to (aside from data dependency already mentioned above) limitations of functional hardware units due to similar operations within the block (albeit independent). For example, if the processor has 5 ALU units available to execute 5 adds in parallel (same cycle), but a basic block has 6 independent adds, then we need two cycles to execute instead of one. That is why VLIW instructions will include differing operations that can execute in the same cycle rather than many of the same operation (dependent on what is available in the hardware). Furthermore, ILP could be limited to 3-4 parallelizable instructions out of around 10 in a basic block (high limit) [F]. This is just a limitation of parallelism in the code itself.
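The six-adds-on-five-ALUs example generalizes to a one-line structural bound. This is only a sketch of the functional-unit limit; it ignores data dependencies and issue-width limits:

```python
import math

# Minimum cycles to issue n independent operations of one type
# on u identical functional units (structural limit only).
def min_cycles(n_ops, n_units):
    return math.ceil(n_ops / n_units)

print(min_cycles(6, 5))   # the example from the text: 2 cycles
print(min_cycles(5, 5))   # 1 cycle
```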

6. Naming dependency (except for true dependencies). This can be resolved with register [C] and memory (except for ones with alias problem potential) renaming which is usually done by the compiler [B]. It is still limited by register availability and management (besides potentially introducing register spilling issues, we may also run into bandwidth issues in register files due to more reads and writes on the expanded list of registers).
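A minimal sketch of the renaming idea (a simplified single-pass renamer, not how any particular compiler or processor implements it): every write gets a fresh “physical” register, so WAR and WAW dependencies disappear, while true RAW dependencies survive because reads follow the renamed writes.

```python
from itertools import count

def rename(program):
    fresh = (f"p{i}" for i in count())
    latest = {}                      # architectural -> physical register
    out = []
    for dst, srcs in program:
        new_srcs = [latest.get(r, r) for r in srcs]   # RAW preserved
        latest[dst] = next(fresh)                     # WAR/WAW removed
        out.append((latest[dst], new_srcs))
    return out

# r1 is reused: WAW between the two writes of r1, and WAR against
# the second instruction's read of r1.
prog = [("r1", ["r2", "r3"]), ("r4", ["r1"]), ("r1", ["r5", "r6"])]
print(rename(prog))
```

After renaming, the second write of r1 lands in a different physical register than the first, so the two adds at the ends of the sequence could issue in the same cycle; only the true r1 → r4 dependence remains.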

7. A big reason why code is hard to parallelize is address calculations [D]. Eliminating or reducing long dependency chains of address calculations via compiler-optimized code is seen to increase ILP. The paper [D] refers to techniques in other resources for optimizing address computations, so I will have to do more reading.

8. Most of the data dependency that limits ILP comes from true dependencies (read-after-write dependencies), because the other two types (anti-dependencies [write after read] and output dependencies [write after write]) can be managed with renaming [E]. Those true dependencies come primarily from compiler-induced optimizations that support high-level language abstractions [E]. The compiler introduces heavy use of the stack to reflect activation records corresponding to function calls and private variables, creating true dependencies that significantly reduce parallelization [E]. The real limitation is not the time it takes to allocate or deallocate the stack pointer register for an activation record, but the long chain of dependencies introduced [E]. [E] shows that even if all dependencies were completely eliminated, leaving only those that update the stack pointer, the total performance gained is nearly unchanged (the stack pointer chain controls the absolute limit for achievable ILP). Removing stack update dependencies, however, has been shown to provide significant performance gains even compared to perfect prediction and renaming: use heap-based activation record allocation instead of stack allocation (accepting the higher overhead of allocation to enable multi-threading and true parallel execution of program traces). Other suggestions include the use of multiple stacks, or switching between stack-based and heap-based allocation at compile time based on the depth of the calling chain (the deeper the call stack, the more benefit gained from heap-based activation record allocation) [E]. Some papers show that increasing the window size exploits more parallelism; [E] shows that while that may be true, “distant” dependencies (beyond the window size) cannot be exploited with out-of-order instruction issue by superscalars, and other methods are needed under a reasonable window size limitation.
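The stack-pointer argument in [E] can be caricatured with a critical-path model. The costs below are invented for illustration; the point is the shape of the chain, not the numbers: serialized stack-pointer updates form a chain proportional to the number of calls, while heap allocations, assumed mutually independent here, do not.

```python
# Toy critical-path model for activation-record allocation.
# Stack: every call reads and writes the stack pointer, so n allocations
# form one RAW chain of length n * cost.
# Heap (assumed parallel-safe here): allocations are independent of each
# other, so the chain is just one allocation's cost.
def alloc_critical_path(n_calls, cost_per_alloc, serialized):
    return n_calls * cost_per_alloc if serialized else cost_per_alloc

stack_chain = alloc_critical_path(8, 1, serialized=True)
heap_chain = alloc_critical_path(8, 5, serialized=False)
print(stack_chain, heap_chain)  # heap wins despite a 5x higher per-call cost
```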

Things to look for or read about

1. Window size adjustments. How about compiler-controlled variable sizes?

2. Perfecting branch predictors; this is very important since it has a major impact on exploiting ILP. Most papers I read ran simulations under unreasonable assumptions such as perfect prediction, unlimited resources, unlimited bandwidth, unreasonably low penalty cost for mispredictions, ignoring spurious exceptions, etc.

3. Handling memory aliases at compiler time.

4. Choosing stack based or heap based activation record allocation at compile time. Maybe even considering multiple stacks – addresses true dependencies introduced by the compiler via deep chain of function call dependencies. A major performance increase can be gained here.

5. Clock rate per operation variation to increase throughput for faster operations. This can potentially increase a low ceiling on potential throughput even for embarrassingly parallel operations on the same CPU.

6. Generation of per-thread-based traces by the compiler, taking into account shared versus dedicated on chip memory, possible proximity to shared caches, etc.

7. Can traces be concurrent rather than parallel? Allowing for concurrent execution rather than parallel execution (allowing values to forward or be written to shared caches rather than waiting for a complete trace to finish before another one to start even on separate cores).

8. Maybe enforce convention by the compiler to allow predictable address fanout (range of memory address) for given functions or programs. For example, for dynamically allocated objects, the compiler may enforce via hints to the hardware how far apart they need to be on the heap, which will allow the hardware to take advantage of locality when loading a cache line from memory. Those can only be hints due to memory allocation and page replacement strategies, but a cooperation from the hardware and hints from the software can increase this utilization.

9. Exploit the nature of sequence analysis algorithms to optimize performance.

10. A hybrid processor approach to realize ILP and DLP (combining VLIW/superscalar and vector processors).


A. Instruction-Level Parallel Processing. Joseph A. Fisher and B. Ramakrishna Rau.
B. Limits of Instruction Level Parallelism with Data Value Speculation. Jose Gonzalez and Antonio Gonzalez.
C. Exploiting Instruction- and Data-Level Parallelism. Roger Espasa and Mateo Valero.
D. Limits and Graph Structure of Available Instruction-Level Parallelism. Darko Stefanovic and Margaret Martonosi.
E. The Limits of Instruction Level Parallelism in SPEC95 Applications. Matthew A. Postiff, David A. Greene, Gary S. Tyson, and Trevor N. Mudge.
F. Limits of Instruction-Level Parallelism. David W. Wall.

The Grand Design (Part IV)

December 28, 2010

This is a continuation of my last post (The Grand Design (Part III)).

There are many other aspects covered in the book, but I don’t see them as necessary to state here to make the summary comprehensible. For example, the concept that if two theories arrive at the same conclusions, then neither is more correct than the other; you could actually have multiple forms of the same real law, and all forms are correct. This is called the effective theory approach, and it is necessary to justify that the variations among string theories do not necessarily invalidate them. Additionally, the book argues for a model-dependent approach to understanding the universe, in which you cannot separate the observer from the observed (nothing exists outside our observation capabilities): the universe behaves the way it does because we observe it behaving that way (either through direct observations or indirect ones that come from the convenience of such assumptions/observations in making the physics work). This seems like a strange concept, but take a simple example from the book. If you observe a sofa in the living room, you know it exists. If you then walk out of the room, does the sofa still exist? If you say yes because you just saw it one minute ago, would you bet your life that it wasn’t removed the moment you left the room (so that it no longer exists in the living room)? However, while you are observing the room from outside and notice that no sofa was taken out, you can “conveniently” assume it still exists, because that makes your math correct and the other possibilities are not very significant. This is necessary to justify using M-theory (a highly conceptual and unproven theory) as the basis for the book’s conclusion. Although it is as yet unproven, it does make sense, and it has accurately predicted many other phenomena.

Another less obvious but very interesting example is the two-slit experiment with delayed choice. If you shoot a single-particle beam of light through a board that has two slits in it, with another board behind it (where the light lands), you will see that the light exhibits wave-like behavior, because it doesn’t travel in a straight line (each of the two slits acts as a new source of light shooting beams in various directions; this is called interference). Feynman explained this by saying that the single particle did not just travel from the source to the final board: it traveled to the board, then back to the beam source through the other slit, then back and forth again, and so on. He explained that the beam had traveled every single possible path simultaneously as it reached the final board. Not only that, but the single particle actually interfered with itself as it traveled back through the other slit!! This caused the interference! He then took the experiment further and added an observation point to see the particle the moment it passed through the middle board (the one with the slits). He observed that the light was not a wave but a particle (he noticed a single particle traveling through the slit). But observing the whole system altogether (without the observation point on the middle board) showed the light acting as a wave, exhibiting the interference behavior. Wheeler took this experiment even further and moved the observation point to right before the light hit the destination board (instead of observing the middle board). When he started observing one slit from the destination board’s perspective, he saw the light going through that slit. When he changed the observation to look at the second slit, he saw the light traveling through the second slit! When he stopped observing, he noticed the light had traveled through both slits, causing interference on the destination board!
This has major consequences: remember, we only have a single-particle beam that was shot against the middle board. This means that this single light particle (photon) had the choice to travel through either slit one or slit two. When we did not observe the slits, the photon traveled through both simultaneously, causing the interference effect! When we chose to observe slit one, we seem to have changed the photon’s choice of which slit to take AFTER it had already made the choice to travel through the slit we later decided to observe! When we started observing slit two, we seem to have told the light to choose slit two AFTER it had already traveled through it! Thus, our delayed choice affected the decision made by the photon in the past!

Those concepts of simultaneous existence and simultaneous position at every possible point in the universe (although only one path wins at the end with the highest probability; this is referred to as the Feynman sum over histories, with every possible path being a history since it has already been traveled!) are very hard to imagine or comprehend. However, those phenomena have already been observed in experiments. Their explanations are theories, but they are the most probable theories because they are consistent in explaining the behavior and the “weird” outcomes.

When I dove into those concepts individually to understand them in more depth, I came to the realization that I had to accept them not because they make sense, but because they provide reasonable answers to those weird observations. However, I couldn’t escape adding my own questions about how some of those things are stated or formulated. Since there are a total of 11 dimensions before any universe is allowed to come from nothing into existence, all possible physics laws (each of which is unique per universe, but contradicted across all others) existed in the same 11-dimensional fabric, one of which is squeezed out into each universe as it comes into existence. Does this mean that there was some sort of “space” with dimensionality that already existed prior to any cycle of creating a universe? What is this space made of? Did it have energy?

Another thing that bothers me is: why do we care so much about conservation-of-energy principles when the universe did not exist yet (for the coming cycle)? Why couldn’t the universe start from a positive or negative initial amount of energy? I know that religious people would love my argument, because they would conclude that for such “extra” energy to exist, there must have been a superpower to “add” it to the system as an initial state. While this may be true, we would then AGAIN be using the principles of physics to say that this extra energy couldn’t have come without God’s intervention. But if physics did not apply then, that makes God’s intervention, again, unnecessary to justify the extra added energy. Again, this is not to say that God does or does not exist; this is just saying that the universe explains itself. But as Newton believed, maybe God creates the equation and lets the universe work using it without ever interfering with it (i.e., miracles). But that is outside the scope. The reason I think my question is “extremely” valid is that it would explain all the extra energy we have in the universe today that doesn’t have a matching canceling energy (to bring the total to 0). If the initial state of the universe was an initial +X or -X, then no matter how many virtual antiparticles you create that are equal in charge to existing matter, you will still end up with excess energy, and that may be OK! Of course, if this turns out to be true, then those antiparticles may not even be necessary!!! Now, if what I am saying is true, that still doesn’t necessarily mean that all the energies we have come to know don’t need to cancel each other out, or that the observable balance of energy (a.k.a. the first law of thermodynamics) is not true. That is, after all, energy that was just transformed from existing energy, and it had better equal what the object has lost.
But that doesn’t mean the entire universe’s initial energy state couldn’t have started with an initial amount. I don’t understand why that is not possible or entertained.

Another thing I don’t understand and would love to learn more about: if objects cause dents or warps in the space-time fabric to exert a gravitational effect on much smaller surrounding objects, are only certain objects allowed to make such warps? What is the minimum size or mass needed to cause such a dent? If curvature is only relative, meaning every object in existence curves space around it in proportion to its mass, does that mean that we humans can also cause curvature and instantly force smaller molecules to orbit around us? If not, why not, if the ratio of mass between us and molecules is comparable to the ratio between the sun and earth? If only “large” objects are allowed to dent space-time, doesn’t that imply that it is the mass of the object that transforms the dimensions of its environment, and not vice versa? Think about it. Dents mean that you are creating a third dimension on a two-dimensional space (think of pulling a nicely folded t-shirt upward to form a pyramid-like shape: you have just changed a two-dimensional object, the folded t-shirt, into a three-dimensional system). That would mean that the initial mass of the big bang particle (infinitely massive) is what forced the number of dimensions into existence (from a pool of 11 total dimensions). This has many implications. One of them is that as you travel close to the speed of light (and your mass approaches infinity), you will create your own dimensions much as the big bang particle did, and thus YOU will explode much like the big bang particle to form your very own universe!!

This is a very fascinating field. I hope to continue reading (and hopefully writing) about it 🙂

The Grand Design (Part III)

December 28, 2010

This is a continuation of my last post (The Grand Design (Part II)).

We know how to visualize a three-dimensional world. With Einstein, we added time as a fourth dimension. Einstein argued that time does not exist on its own, and neither does space; they exist together and relative to each other. As you approach the speed of light, time seems to go slower on your ship (time passes much faster for someone observing your ship). As time goes slower, distances shrink (you cover more ground in the same time period). This was stated in Einstein’s special relativity back in 1905. To simplify things, let’s assume you are on a plane. Say you walk one yard forward and it takes you one second. For a person observing you from outside the plane, you actually traveled tens if not hundreds of yards over that same second. Space and time here are relative to the observer, not absolute. So Einstein came up with the space-time concept, which shows that space and time co-exist and affect each other. That is our fourth dimension. As we reach the speed of light, time comes to a halt (the time dimension is “folded” away) and we only have the concept of space (no time), and this is exactly what happened during the big bang, when particles were traveling at the speed of light. That is why it is laughable to talk about what happened before, during, or even shortly after the Big Bang: time is not absolute, and it was totally irrelevant, as all the events above happened “simultaneously” and not in a simple chain of subsequent events. As the universe expanded and started cooling off, things “slowed” down away from the speed of light, which initiated the time dimension as it became “relevant”. A few things I mentioned need quick addressing before you continue to scratch your head. First, you may ask: I thought we said that the universe was shown to be accelerating away rather than slowing down.
Wouldn’t that mean that if we started at a speed equal to the speed of light, we should now be way beyond it? Actually, that is correct, but it is the space in the universe that is expanding and accelerating, not the particles. Think of it this way: some very strong force may come to your neighborhood and start expanding the space between homes, which effectively expands the total area your neighborhood occupies. This doesn’t imply that each home gets bigger. As a matter of fact, while your neighborhood is expanding, a home may actually become smaller, if you happened to be downsizing your home (to save taxes on livable space, maybe) by removing rooms.

Now you may ask another question: what do you mean by the time dimension being folded away? It is good that you asked, because it brings me back to my initial point about the multiple dimensions of the universe. I want to start by throwing a bomb first and say: M-theory states that there are a total of 11 dimensions (10 space dimensions and 1 time dimension). After I leave you to scratch your head for a few moments, I come back and describe how to even visualize what that means. Say you have a box, which obviously exists in three dimensions. Now start pressing against the top of the box to reduce its depth until it gets very close to 0. You will notice that your three-dimensional box starts to look like a two-dimensional square. Just because the height or depth of the box is no longer clearly visible to you doesn’t mean it doesn’t exist! Physicists call this “dimension folding”: we have folded the third dimension of the box until it appeared as if we only had two dimensions. The other thing to keep in mind is that in a two-dimensional world, we cannot even begin to comprehend what three dimensions mean (since we are not allowed to escape the xy coordinate system, there is nothing you can do to explain three dimensions to a two-dimensional object). Keeping those two things in mind, the idea that we may have 11 dimensions, all folded except four (three space and one time), does not sound “as” crazy as it did a few sentences above. Now, according to Einstein (through general relativity), gravity doesn’t exist! It is space curvature, or dents, that make an object exhibit what we know as gravity. The earth doesn’t rotate around the sun because the sun is “attracting” it; the earth rotates around the sun because the sun makes a dent or warp in the space-time “fabric”, causing the earth to fall right into an orbit around the sun.
That is how Einstein explained why light seems to curve (or bend) around the sun, revealing stars that are located directly behind it. The light doesn’t bend; its space-time path is bent. Think of an airplane traveling around the earth in a straight line. From the perspective of an observer located on the moon, the airplane is going around in a circle, not in a straight line. To understand how an object can affect another object’s path just by curving the space around it, take the game of curling. A player slides a heavy stone down an icy path, and two other players skate on both sides of the stone, “sweeping” the ice to induce curvature in the path and force the stone to slide left or right as it moves. They are not allowed to touch the stone, but they can affect its path just by sweeping around it, creating a smoother incline in one direction or another that changes the stone’s straight path into one that obeys the changes in curvature.

Now, why did I start talking about space-time curvature? The Grand Design states that the curvature and the number of unfolded dimensions together make up the laws of physics “for that universe”! There are a few assumptions here. First, that all natural laws observed by us come from very few (or just one) physics equations. Everything else was derived as an approximation to understand the behavior of complex objects (human beings, planets, etc.). The real law(s) that started the universe applied at the quantum level only, since the universe started with quantum particles. The simple law(s) of physics that started everything else is (are) a direct consequence of the space-time curvature and the number of dimensions. With a higher number of dimensions, the curvature changes such that gravitational forces appear weaker, while with fewer dimensions they appear much stronger. With the four dimensions that we have, the forces are what we have come to know and observe today. According to M-theory, there are 10^500 possible universes based on the number of dimensions and their space-time curvature. The book states that ours is the most stable: with more dimensions, gravitational forces would have been weaker, and planets would have escaped their orbits (and earth would no longer have rotated around the sun and enjoyed its heat). With fewer than four dimensions, gravitational forces would have been stronger, and planets would have collapsed onto themselves due to those stronger forces pulling objects together. That is why our universe lives longer, and its “bubble” does not burst too soon for us to exist. And that also explains “why” the laws of physics are what they are.

So, does that mean we have found The answer to our existence? Not yet. There are many assumptions behind the conclusion that the universe creates itself over and over again out of nothingness, which would mean God is not “necessary” to allow this to happen. First, we are assuming that Einstein’s general relativity is correct. It hasn’t been proven correct, but it has correctly predicted many phenomena in nature that nothing else was able to explain, such as the bending of light (with accurate equations), the unification of gravitational behavior between big objects and small (quantum) particles, etc. It is also assumed that string theory is correct, where matter and force are the same thing (this is necessary to explain how we can cancel the energy that exists in matter by assuming an equal energy exists in the force fields around the matter itself), and that particles are not particles or waves but small strings that exhibit particle AND wave behavior based on how they vibrate. Strings have never been seen in a lab, simply because they are far too small to be observed using today’s technologies. It is also assumed that M-theory, which encompasses the versions of string theory, is the correct set of equations that unifies them all (there are many versions and formulations of string theory, although they all arrive at the same results). What stops us from “confirming” the statements made by the book (statements that were in no shape or form originated by the authors, but have been around for a long time) is the fact that we cannot use M-theory to explain how the infinite energy stored in matter is canceled out, so that we can arrive at the fundamental 0-energy balance we were talking about. We assume it is all canceled out, but the only way to confirm this through the math is by proving that M-theory is finite and solvable.
By doing so, M-theory would not only have a mathematical proof, but it would also have answered all questions about how the laws of physics came into existence. And if those theories, labeled collectively as M-theory, can be reduced to a single equation, then that would be the one equation for everything that Newton, Einstein and physicists have been chasing for centuries.

— To be continued here

The Grand Design (Part II)

December 28, 2010

This is a continuation of my last post (The Grand Design (Part I)).

Those virtual particle pairs exist in theory only, where they make other physics equations balance. No one has seen them, and if our physics laws are correct, no one ever will! The reason is the Heisenberg uncertainty principle, which implies that you cannot pinpoint the existence of matter in space with high confidence, since mass is energy and energy is mass, and they continuously move into and out of existence. This is, of course, the simplified explanation. The more detailed explanation is out of scope and relevance at this point.

Advanced: The Heisenberg uncertainty principle states that you cannot know both the position of a particle and its momentum / velocity at the same time. The higher the certainty in one of those two quantities, the higher the uncertainty in the other.
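The trade-off in that Advanced note can be made concrete with a few lines of arithmetic. This is a minimal sketch of the standard relation Δx · Δp ≥ ħ/2, using textbook constants; the choice of an electron confined to roughly an atom’s width is just an illustrative example.

```python
# Minimal sketch of the Heisenberg uncertainty relation: dx * dp >= hbar / 2.
# Given how precisely we pin down an electron's position, compute the
# minimum possible uncertainty in its momentum (and velocity).

HBAR = 1.054571817e-34      # reduced Planck constant, J*s
ELECTRON_MASS = 9.109e-31   # kg

def min_momentum_uncertainty(dx_meters: float) -> float:
    """Smallest momentum uncertainty allowed for a position uncertainty dx."""
    return HBAR / (2.0 * dx_meters)

# Confine an electron to roughly an atom's width (~1e-10 m)...
dp = min_momentum_uncertainty(1e-10)
dv = dp / ELECTRON_MASS  # ...and its velocity is uncertain by hundreds of km/s

# Halving the position uncertainty doubles the momentum uncertainty:
assert min_momentum_uncertainty(5e-11) == 2 * min_momentum_uncertainty(1e-10)
```

For everyday objects the same bound is astronomically small, which is why we never notice it outside the quantum world.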

Now, if we are past all the controversy of the total energy in the universe being 0, then it is easy to explain (at least mathematically) how you can have no universe at all one second, and all of a sudden have one with -X and +X energy coming into existence. That is OK because they add up to 0 anyway, so we haven’t introduced anything new into the system or taken anything out of it. Now we have explained existence from nothing, and collapse back into nothing, in an endless cycle. How does this apply to the big picture (the universe) as opposed to just the individual elements of it (particles)? The Grand Design states that the universe did not know anything about big and complex objects when it came into existence. It was all about particles (quantum physics as opposed to classical physics). So, the only laws that applied were the laws of quantum physics (laws that apply to small matter rather than big matter). It just so happened that our universe “lasted” beyond a critical point where complex life developed as we know it today (as opposed to immediately going back into nothingness). I am going to summarize billions of years’ worth of events in the next paragraph.

So, the universe was condensed into a very small particle about 13 billion years ago (we know that value from various observations, including the microwaves that we receive today from the big bang, and from other observations such as the speed at which galaxies are moving away from each other today and when they might all have been close together before they started moving apart, etc.). Energy expanded away from the position of the particle in all directions. The “space” between those “newly” generated particles expanded faster than the speed of light (only the expansion of space can exceed the speed of light). Stars formed from those highly charged particles (hydrogen, the simplest atom). As hydrogen fused inside stars, helium formed. Two helium nuclei came together to form beryllium. This nucleus can exist for only a very short period of time and only under extreme temperatures, which is why it can exist, temporarily, only inside a star. Another helium nucleus combines with a beryllium to form a carbon atom (the only element we know of that is capable of making up a living cell). This process of three helium nuclei coming together is called the triple-alpha process. Carbon atoms were, and continue to be, shot into space when a star explodes in a process known as a supernova. Those atoms then cooled off (slowed down) and clumped into planets. Rising entropy and suitable surroundings (temperature, water and the right amounts of gases) allowed certain combinations of atoms to form single living cells, which evolved over millions of years into what we know today as human beings. All of these are theories postulated from observations of microwave rays that “told us” how the universe expanded and became what it is today, and how long it was evolving before it reached its current state.
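The triple-alpha step above can be checked with back-of-the-envelope arithmetic. The atomic masses below are the standard tabulated values (they are well-known constants, not taken from the book), and E = mc² turns the tiny mass difference into the energy each fusion releases.

```python
# Back-of-the-envelope check of the triple-alpha process: three helium-4
# nuclei fuse (via an unstable beryllium-8 intermediate) into one carbon-12,
# releasing the mass difference as energy (E = mc^2).

U_TO_MEV = 931.494          # energy equivalent of 1 u, in MeV
MASS_HE4 = 4.002602         # u
MASS_BE8 = 8.005305         # u
MASS_C12 = 12.000000        # u (exact, by definition of the unit)

# Step 1: two alphas -> beryllium-8. Slightly endothermic, which is why
# Be-8 only survives fleetingly and only inside a hot star.
step1_mev = (2 * MASS_HE4 - MASS_BE8) * U_TO_MEV   # about -0.09 MeV

# Step 2: Be-8 + alpha -> carbon-12, strongly exothermic.
step2_mev = (MASS_BE8 + MASS_HE4 - MASS_C12) * U_TO_MEV

total_mev = (3 * MASS_HE4 - MASS_C12) * U_TO_MEV
print(f"triple-alpha releases ~{total_mev:.2f} MeV per carbon-12 nucleus")
```

The first step actually costs a little energy, which is why beryllium-8 exists only fleetingly, exactly as the paragraph describes.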

Advanced: By measuring the distances between us and the nearest galaxies and stars, and how fast they are accelerating away from us, we can backtrack to see how “close” all those objects were going back in time to 13 billion years ago (a number we estimated from observing the microwaves that traveled to us from the moment of the big bang). That is how we came to believe that all those cosmic objects were once so close to each other that they were all condensed into a small particle.
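That backtracking estimate can be sketched in a few lines. Under Hubble’s law (recession speed proportional to distance, v = H₀·d), running the expansion backwards at a constant rate gives an age of roughly 1/H₀; the H₀ value below (~70 km/s per megaparsec) is an assumed round number in the commonly quoted range.

```python
# Rough sketch of the "backtracking" estimate: if a galaxy's recession
# speed is proportional to its distance (Hubble's law, v = H0 * d), then
# running the expansion backwards gives an age of roughly 1 / H0.

H0_KM_S_PER_MPC = 70.0          # assumed Hubble constant, km/s per megaparsec
KM_PER_MPC = 3.0857e19          # kilometers in one megaparsec
SECONDS_PER_YEAR = 3.156e7

h0_per_second = H0_KM_S_PER_MPC / KM_PER_MPC        # H0 in 1/s
hubble_time_years = 1.0 / h0_per_second / SECONDS_PER_YEAR

print(f"naive age estimate: ~{hubble_time_years / 1e9:.1f} billion years")
```

The naive answer lands around 14 billion years, in the same ballpark as the 13-billion-year figure in the text; the real estimate corrects for the expansion having sped up and slowed down over time.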

Now that we are experts on how the universe came into existence, this naturally leads us to another logical question (actually, one of many logical questions): how is it going to end? When is it going to move out of existence into nothingness again? Physics has no answers for these forms of the same question, but there are a few theories. One says that as the universe expands, its attraction back toward the center (where the big bang started) only gets stronger (like a rubber band: the more you stretch it, the more it wants to return to its “stable” position). Some believe that the expansion is slowing down until it reaches a standstill, and then it will all collapse back into a very small, dense particle, into non-existence, and then back again. More modern theories claim this is not possible, because our observations show that the universe’s expansion is accelerating, not slowing down. So, how can the universe just disappear when it is ever expanding and becoming more “complex”? The Grand Design addresses where the universe is going (into non-existence), but it doesn’t address how it will get there. It gets away with saying that we should think of our universe as a bubble forming on the surface of boiling water, expanding, then bursting, then another one forms, and so on. Actually, the book says many universes are born like those bubbles “simultaneously”, and some burst sooner than others (thus not lasting long enough for life to develop) while others last longer (like our universe). This brings up an interesting point. Does that mean there are other universes that co-exist with ours? The answer is yes! Confused yet? Let me try harder to confuse you. Those universes do not “really” exist simultaneously. They are all possible paths our own universe could have taken, and has already taken. Let’s first address the “could have taken” part, then the “has already taken” part. Think of it this way.
If you were on top of a mountain and threw a rock in some direction with a certain force, that rock would end up somewhere in the valley. Now, think of how many possible combinations of directions AND forces this rock could have been thrown with. All of those are called possible paths, and they depend on an initial state and certain conditions (the force and its direction). The same goes for our universe. It could have taken any of many paths (we will save the part about the universe having taken all possible paths for when we mention the Feynman sum over histories later). Precisely, it could have taken any of 10^500 possible paths! That is 1 with 500 zeros to the right! Now, The Grand Design uses Feynman’s sum-over-histories principle to state that our universe has taken the “most natural path”, the one with the highest probability, which is essentially a weighted average of all the possible paths it could have taken. The Grand Design goes on to state that our universe is actually the most stable of all, and that is why it has the highest probability (a stable state). Now, what is the difference between our universe and the other universes in a multiverse system? Nothing but the number of dimensions and the curvature of space-time. Tighten your seat belts even more.
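The rock-throwing analogy can be sketched numerically: each (speed, angle) pair is one possible “path” ending at a different landing spot. This is a simple flat-ground projectile sketch, not anything from the book; it just shows how a grid of initial conditions fans out into a family of outcomes.

```python
# Each combination of throwing speed and angle is one possible "path",
# ending at a different spot. Flat-ground range: R = v^2 * sin(2*theta) / g.

import math

G = 9.81  # gravitational acceleration, m/s^2

def landing_distance(speed_m_s: float, angle_deg: float) -> float:
    """Horizontal range of a rock thrown at the given speed and angle."""
    theta = math.radians(angle_deg)
    return speed_m_s ** 2 * math.sin(2 * theta) / G

# Enumerate a small grid of possible throws: every combination is a path.
paths = {(v, a): landing_distance(v, a)
         for v in (10, 20, 30)           # throwing speeds, m/s
         for a in (15, 30, 45, 60)}      # throwing angles, degrees

# For any fixed speed, a 45-degree throw travels farthest:
best = max(paths, key=paths.get)
assert best == (30, 45)
```

Twelve combinations here; the universe’s version of this grid, per M-theory, has 10^500 entries.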

— To be continued here

The Grand Design (Part I)

December 28, 2010

I have always loved theoretical physics. I have long wanted to grab a physics book and read about every new theory or principle (chaos theory, string theory, M-theory, etc.), especially in quantum physics (the physics of small matter). New principles and theories kept accumulating, and I felt like I was falling “behind” and needed to catch up. But a person in my position at a young startup finds it hard to allocate those precious moments of reading and researching. And coming from a computer science background, I have to do more reading and understanding than any physicist (and forget about the complexity of the math that comes with it). I came across Hawking’s latest book, “The Grand Design”, and wanted to read it. My fascination with physics was revived, and I promised myself that I would not just read the book, but also read about all the theories it references until I had a good understanding of the whole concept proposed. When my laptop went crazy and I had to send it back to Dell for diagnosis, I found myself sitting here for a few days without coding. This was the perfect opportunity to finally get the book and start my quest. This post is not meant to be scientific research. It is just my recollection of all the things I read in the past few days from various books (I had to go back and read about Einstein’s general relativity to understand space curvature, for example). I wanted to make this post as simple as possible, but sometimes it is hard to escape the “big words” that may confuse the average Joe. So, I decided to keep the post simple, and where I wanted to reference the big words, I would include a side note that starts with “Advanced”. You may skip those side notes if you are not comfortable reading them, and you won’t lose out.

The Grand Design is another physics book that set out to explain the laws of the universe and edge us closer to finding the Holy Grail of physics: the equation of everything, famously believed to be so simple and short that it would fit on a t-shirt (a belief that is slowly shifting toward the idea that maybe it is not just one equation but a few of them). Or at least that is what physicists hope it would be. There are so many things to cover before describing what this book is about, and so many things to say afterward. However, I am writing a post, not a scientific paper or a book, like I said earlier. You will still have to do the further reading yourself to validate what I am saying or to learn more detail.

In a nutshell, Hawking (if you have never heard of him before, then maybe you should stick to my other posts instead) states in his book that the universe did not need a helping hand from a divine “someone” in order to start rolling (or expanding). The biggest argument against physicists who are proponents of the new physics (string theory, general relativity, the big bang, etc.) was: well, if everything came from an extremely dense particle which exploded and expanded over a very short “period of time” to form the universe, where did this particle come from? A very valid question, given that physics itself firmly states that nothing comes from nothing (well, until string theory came about, which I will describe briefly in this post, when this line changed to: something can actually come from nothing, provided that its total energy continues to be nothing). Then physicists replied: just because we don’t know the answer yet doesn’t mean we should give up trying and hand the credit to a divine “something”. Plus, if there was a God, who created him? And so on and so on. This debate has been going on for thousands of years. With the new set of string theory variations (collectively referred to as M-theory), physicists now argue that they have found the answer. Everything came from nothing and is going back to nothing! Very interesting, considering this appears to violate the first law of thermodynamics. Or does it? How can energy come into “being” without coming from somewhere? The book took a bottom-up approach, addressing all the theories and laws that eventually contribute to the Grand statement toward the end of the book. I will do the exact opposite. I will make the statement, then dive into the corresponding explanation. If you have hung with me so far, put on your seat belt. We are going to address this right about…now.

In the simplest possible terms: “0 energy” is what this universe has in total! Even as we exist today, there is positive energy, such as when work is being done on particles or objects, and there is negative energy, such as when a particle does work and radiates energy away. The sum of those quantities across the universe is 0, as all the positive energy cancels out the negative energy. The positive energy used to create matter as the universe started to expand is offset by the gravitational energy that wants to collapse the system back into a black hole.
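An order-of-magnitude sketch gives a feel for why this cancellation is even plausible. The mass and radius below are very rough assumed round numbers for the observable universe (not figures from the book), and the comparison is only meant to show that the two energies are in the same ballpark, not that they cancel exactly.

```python
# Order-of-magnitude sketch (with rough assumed values) of the "zero total
# energy" idea: the positive rest-mass energy of everything we can see is
# comparable in size to the negative gravitational energy pulling it together.

G = 6.674e-11        # gravitational constant, m^3 / (kg s^2)
C = 3.0e8            # speed of light, m/s
MASS = 1e53          # rough mass of the observable universe, kg (assumed)
RADIUS = 4.4e26      # rough radius of the observable universe, m (assumed)

positive_energy = MASS * C ** 2              # rest-mass energy, ~9e69 J
negative_energy = -G * MASS ** 2 / RADIUS    # gravitational energy, ~-1.5e69 J

# Same order of magnitude -- consistent with the idea that the two
# contributions could cancel in a more careful accounting.
ratio = abs(negative_energy) / positive_energy
print(f"|gravitational| / rest-mass energy ~ {ratio:.2f}")
```

A crude Newtonian estimate like this cannot prove the cancellation; that is exactly the gap the Advanced note below says M-theory is supposed to close.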

Advanced: As for the matter itself (since mass and energy are interchangeable), no one knows what would cancel it out. That is where supersymmetry and supergravity theories (M-theory) came to be accepted, because they state that the negative energy present in the vacuum cancels out the positive energy of matter (although the math cannot be worked out yet with all the infinities, which is why M-theory is just a theory, albeit one with a lot of potential if proven to be finite).

Now, I am not going to dive into another big and controversial area: “space” and “vacuum”. What is space made up of? There are many theories, such as those that say space is occupied with small amounts of “virtual” particle pairs whose total energy sums to 0 (this is what M-theory proposes by suggesting that matter and force are two faces of the same thing, and thus there must exist something to cancel out the energy in all the matter that occupies the universe and its space). Those pairs consist of one particle and one antiparticle (of opposite charge) that simultaneously come into existence and then collide and disappear again (while radiating gamma rays). In summary, everything in the universe goes from mass to energy, back to mass, and so on, and we have equal amounts of positive and negative energy, totaling 0.

— To be continued here

IT Forecast: Cloudy!

December 20, 2010

Cloud computing is picking up steam (and moisture). I already covered the essentials of cloud computing in two separate posts (start here). As I mentioned in my earlier posts, it is not a new technology or anything innovative. Cloud computing concepts were employed in academia and research for many years, mainly to make the most out of commodity computers. The enterprise started to pay attention at the start of the post-Y2K, money-saving, lean-seeking era. I attribute the speedy adoption to three main factors:

1. Increasing cost of ever-expanding data centers, and purchasing new servers and cooling them.
2. Advancement of virtualization and associated management tools (VMware, Citrix, Microsoft, etc.)
3. Amazon’s EC2 (launched in 2006).

During the economic slowdown this past decade, businesses looked for opportunities to reduce hardware, cooling and energy costs, as well as data center and maintenance costs. Hardware manufacturers poured hundreds of millions of dollars into greener and more power-efficient machines. Each new generation of chips from AMD or Intel delivers more computing power at lower power consumption. That wasn’t nearly enough to reduce costs, though. Companies still had to deal with ever-increasing data center sizes. They moved to consolidate servers, which proved harder than they had anticipated, as legacy applications weren’t easy to migrate to newer servers with fewer available resources. Advancements in virtualization technology and its associated management tools allowed IT to consolidate servers and pack applications onto fewer physical machines (still using those powerful machines, but more efficiently now), as CPU utilization increased from under 15% to much higher rates. This allowed a major reduction in data center size, by a factor of 10 on average.

Companies jumped on the virtualization bandwagon as a way to reduce the cost per physical machine and lower the power consumed by idle servers. Virtualization also helped contain rapidly ballooning data centers. But that is not enough. Companies still have to buy expensive servers, maintain them, manage the overhead (IT staff overhead as well as physical management overhead), etc. That, along with the initial major push by Amazon that got the cloud computing engine started in the enterprise, allowed businesses to take advantage of a new face of old technology. With cloud computing, companies are able to outsource major parts of their data centers to an outside cloud service provider. They save on IT overhead (cloud service providers bring their own staff), software management (security patches, upgrades, updates, etc.), hardware management (allocating physical space, cabling the servers, rack space, etc.), and hardware cost, as companies no longer need to over-provision their data centers to handle future spikes in client requests, only to return to a normal request cycle after those spikes.

With cloud computing, companies can concentrate on their own core business without consuming their time and effort managing overhead that does not contribute to the company’s IP or bottom line. Businesses can take advantage of a hybrid cloud to outsource their scalability-hungry applications, while keeping in-house those elephant applications (that do not change much and do not require run-time scalability). With cloud computing, companies do not need to buy powerful servers. Gone are the days when servers kept getting more and more powerful. With cloud computing, commodity computers are kings. I expect many server manufacturers to shift their resources to building either internal (private) cloud-ready servers (that replace software solutions for building and maintaining cloud-ready infrastructure) or servers with features and properties that allow them to plug into an existing cloud server rack (servers with functionality stripped down to the bare minimum to allow a lightweight computing machine that is green and cost-effective). The reason why cloud computing marks the beginning of the end for high-end servers is that as companies move their infrastructure to the cloud, cloud service providers will realize that to stay competitive in the per-hour resource renting space, they will need to lower their per-physical-server cost. To do that, they will need to utilize virtualization (to maximize income per physical server) and lower the cost of each physical server and its associated power consumption. To lower those costs, cloud service providers will use commodity computers that are cheaper and require less energy. In a world of unlimited CPU and memory resource pools, there is no need to buy an expensive 64 GB server anymore, one that costs far more than a pool of commodity computers with the same total amount of memory.
Furthermore, commodity computers are stripped down to such bare-minimum features that they do not require much software management or overhead resource burning. That is why Google builds its own commodity computers instead of buying them.

Businesses will continue to outsource their infrastructure, platforms and applications to the cloud as they realize that they would be more productive focusing on their core business functions rather than all the bells and whistles needed to make that happen. And as businesses outsource this overhead to a company dedicated to managing it for a far more manageable cost, IT administrators will see their jobs decaying away. IT administrators will have to find other things to do outside of their normal range of functions. They will have to acquire a new set of skills, probably in the development field, as they notice the shift of IT management power from their hands to the end user. With cloud computing, the promise of simplifying IT management is stronger than ever. An average user will be able to log on to the cloud service provider and manage their own applications and infrastructure using user-friendly management pages, without needing any prior technical background.

Cloud computing is not a revolution, but its adoption this coming decade will be. As the mobile device and notebook markets grow in size, client devices are becoming thinner and thinner while applications are getting richer and richer. This is only possible with hosted services that are ever-scalable, available, and fail-over ready. Those are just a few of the promises the cloud provides. Consumers will go after smaller and cheaper devices and terminals, as there will be no need for powerful laptops and desktops anymore. If I can afford to buy 5 or more dumb, thin, and very small terminals and distribute them around the house, and then buy monitors and attach them to those terminals, then I can use a VDI solution hosted on a popular cloud service provider to load my desktop (along with my session) on any of the terminals in my house! My remotely hosted applications will be running on a supercomputer-like grid of commodity computers with all the resources they need. I can create custom-made desktop VMs for my children with high levels of control. They can destroy their VM and I can get a new one from the host service in no time! No slow computers, and no dropped and broken computers. No 10 wires per machine (just one for the terminal). This is going to be the new generation of personal computing within the next few years. I may be able to hook a 17-inch LCD to my iPhone and see my VM hosted on RackSpace.com on the LCD as if it were connected to a very powerful desktop!! How about eliminating the need for an LCD and using a projector screen? Maybe my smartphone will allow for such a projection, which would let me take a very powerful computer wherever I go without losing speed, sacrificing battery power or even giving up screen size!

No one will benefit more from cloud computing than governments, small-budget businesses and not-for-profit organizations. Buying small, thin terminals (perhaps the size of one’s palm, or even finger-sized) and investing most of the money in the data center (private clouds), or purchasing more and more services from the public cloud, would lower costs. No more worrying about backup, scalability, compliance, licensing, clustering, fail-over, high availability, bandwidth, memory, replication, disaster recovery, security, software upgrades, re-location, etc. Those are all given as promises outlined in detail in a service level agreement (SLA). Even better, those institutions and businesses will be able to deliver the same consistent service across campuses and locations.

Additionally, interoperability and integration (standards are being laid out, but will hopefully solidify and become industry-wide accepted standards within a few years) will allow companies to adopt new software and applications with a few checkbox selections on the cloud management page for their data center. A company can switch from SQL Server 2008 to MySQL Enterprise with a checkbox selection. Users can switch their email clients from one site to another, etc. Even beyond that, a consumer can switch a whole platform from Windows 7 to Linux Ubuntu and back in a few seconds. Platforms and applications become tools and roads rather than destinations. This is only good for consumers, because the ease of transitioning in and out of platforms and applications creates opportunity for all businesses alike, along with the caution that comes from the fear of losing customers. This will result in a booming period for open source software (difficult installation and setup processes have kept most open source software out of the public’s hands), as management becomes transparent and standardized.

The next decade will allow some exciting opportunities to unfold as businesses start sprinting in a fast-paced race, after a long decade of dieting (they became leaner) and adopting new technologies that allow them to concentrate on their core business rather than all the extra fat (overhead).

I will write another post dedicated to some of the available cloud services and applications that people and small businesses can use immediately to manage their startup or maintenance costs, without falling behind competitors that use better software and services.

It is hard to forecast what will come next, but one thing is for sure: it will definitely be cloudy!

Cloud Computing Simplified (Part II)

December 14, 2010

This is a continuation of my last post (Cloud Computing Simplified (Part I)).

I was still standing there in front of my friend. I had many questions answered and well articulated inside my head. But, just like taking a test, thoughts are useless if the oval corresponding to the right answer on the answer sheet is not filled in. Her original question was about her company’s website, and whether it was SaaS or cloud. Before I enter what appears to be a brain trap, I need to define what SaaS is. After all, I used to tell my students that if they could not explain a highly technical concept to their grandmas, it was an indication that they didn’t understand it themselves. I was standing there in front of a younger version of my grandma, but the principle is the same. The easy explanation of SaaS is simply “Software as a Service”. Great. Case closed. Not really. What the hell is software? And what the hell is a service? And what does it mean to provide software as a service? We know what software is. It is any application that you use day to day. A service means that you have a black box sitting somewhere else (the cloud?) that takes input from a user and produces an answer. Think of something like a calculator. It takes numbers and operators and produces the answer. There are a few characteristics of services. First, a service has to be stateless (it does not remember you the second time you call it, and it doesn’t discriminate against who is calling it). Second, it has to produce one answer, which is consistent (no matter how many times and at what time of day you call the same service with the same arguments, you should always end up with the same answer – unless you are calling the getCurrentTime service :)). The last property of a service is that it is reachable via TCP/IP calls (not via a mailed letter using the post office).
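The calculator example above can be sketched in a few lines. The `calculate` function below is purely hypothetical, not a real API; the point is the two properties it demonstrates: statelessness (no memory between calls) and consistency (same arguments, same answer). In a real deployment it would be exposed over HTTP on top of TCP/IP, satisfying the third property.

```python
# A minimal sketch of the service properties described above, using a
# hypothetical calculator service (names are illustrative, not a real API).

def calculate(a: float, op: str, b: float) -> float:
    """Stateless: keeps no memory of previous callers.
    Consistent: the same arguments always produce the same answer."""
    operations = {
        "+": lambda x, y: x + y,
        "-": lambda x, y: x - y,
        "*": lambda x, y: x * y,
        "/": lambda x, y: x / y,
    }
    return operations[op](a, b)

# Consistency: calling twice with the same arguments gives the same answer.
assert calculate(6, "*", 7) == calculate(6, "*", 7) == 42
```

Contrast this with something like getCurrentTime, which by design returns a different answer on every call and is the joking exception in the text.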

So, providing the application or software as a service that is accessible from anywhere (since the web runs on a wrapper protocol (HTTP) above the TCP/IP protocol), by anyone (the no-discrimination principle), and always consistently, is what SaaS is all about. Because you provide that software as a service callable by anyone from anywhere, you can also create code that calls this software from another piece of software! This makes for a powerful concept on the cloud, where applications are interoperable using proprietary-free calls. This allows not only scalability across regions on the Internet but also interoperability among different cloud service providers (such as Microsoft, Amazon and RackSpace)!

The same definition of SaaS applies to all the other ?aaS resources. IaaS is nothing but offering hardware/infrastructure resources via TCP/IP API calls (S3 by Amazon is a leader in this space). PaaS is offering enabling platforms (such as Force.com and Google App Engine) which allow developers to create SaaS on top of those platforms on the cloud (such as Salesforce.com and Google Apps). There are other ?aaS such as DaaS (which stands for data as a service if you are talking to a data provider, or desktop as a service if you are talking to Citrix, Cisco and VMware), etc.

The reason why ?aaS resources are provided via publicly-accessible APIs is to hand the power of management to the end user (as opposed to the IT administrator who has been hogging the power over computing resources for a long time). Those public APIs allowed many third-party companies to create pluggable, easy-to-use management tools to manipulate those ?aaS resources on the fly. Yes, that is right, the cloud changes the concept of the Internet (or evolves it?) from a relationship where the end user is a helpless recipient of what is given to him, to a more empowered user that has the ability to interact with the website and public service, albeit at a minimal level (via Web 2.0 and Ajax-driven applications), to being completely in control, where he not only manipulates what the service has to say, but also how many resources are available to it, etc., via a user-friendly interface to the cloud service provider’s data center. Much like TurboTax, user-friendly interfaces to cloud service providers’ websites empower average cyber-joes to design their own data center and their public applications with a 16-digit magic code (a.k.a. credit card number). Although I think those “user-friendly” interfaces have a long way to go before they become friendly, what we have today is a good start.
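
As a rough sketch of what that empowerment looks like in code, here is a toy management client in Python; the class and its methods are entirely hypothetical and only mimic the shape of real providers’ public APIs:

```python
# A toy sketch of programmatic resource management. In a real provider's
# API these calls would go over HTTP; here a list stands in for the
# provider's data center, and all names are made up.

class CloudClient:
    def __init__(self):
        self.instances = []

    def launch_instance(self, size: str) -> str:
        """Provision a new (pretend) instance and return its id."""
        instance_id = f"i-{len(self.instances):04d}"
        self.instances.append({"id": instance_id, "size": size})
        return instance_id

    def scale_to(self, count: int, size: str = "small") -> None:
        """Grow or shrink the fleet to exactly `count` instances."""
        while len(self.instances) < count:
            self.launch_instance(size)
        while len(self.instances) > count:
            self.instances.pop()

client = CloudClient()
client.scale_to(3)
assert len(client.instances) == 3
```

The “pluggable easy-to-use management tools” mentioned above are essentially friendlier front-ends wrapped around exactly this kind of call.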

That is it, I have made my friend wait too long, and if I don’t let the words out soon, I may as well switch the topic and improvise a weather joke that includes the word “cloud” in it. She did ask whether her website is a SaaS. Let’s examine this for a second. Is a website software? That is much harder to answer than it seems. Wikipedia defines software as “… a general term primarily used for digitally stored data such as computer programs and other kinds of information read and written by computers.” This implies that websites are considered software. This is not intuitive from a developer’s perspective, as websites seem to be just an outcome or result of software. I actually agree with this intuition and declare that websites are results of software and not software by themselves. Just like this blog is not considered software but a few paragraphs of text. So, since a website is not software, it cannot be SaaS. However, the code that produces the website is software and is running on a server somewhere. Yet, although it is running on a host somewhere, it (the software itself) is not available as a service. No one interacts with the software itself. Users interact with the results of the software (i.e., the website). Users cannot write code that plugs into the server-side code of the website to modify it or use it. Thus, it is not SaaS. But wait a minute, does that mean that gmail is not a SaaS? Not quite. Gmail is SaaS because, although it has a relatively constant interface that users cannot modify 100%, it is provided publicly as a set of APIs that third-party libraries can plug into to not only completely change the interface of gmail for particular users, but also extract email and most of gmail’s functionality and integrate those into a third-party application. That is something you cannot do with a typical third-party website.

So, finally I got to the first question and I have an answer. No, a website is not considered SaaS. Phew! That only took about two blog entries to answer! What about a website being The Cloud? No, a website is not the cloud. However, it can be running on a cloud (with all the benefits of infinite scalability, FA, HA, clustering, etc.). But wait a minute. A website is not a SaaS, but it can run on the cloud? How is that possible? Well, did you ever (as a developer) write non-OOP code using an object-oriented language? What about forcing SOAP to manage sessions? You get the point. It is not ideal to have an application running on the cloud that is not highly service-oriented. If you end up running such a badly-designed application, then you will not utilize the full power of the cloud; instead, you will be using it as a simple host provider for your site. Which is what many companies do, after all. The reason I say that you will not realize the full benefit of being on the cloud with a non-service-oriented application (such as a website) is that if your website has poor session management (non-distributed), then scaling it out is going to be really hard (how will you make sure that when a new request goes to another instance, it will have access to the original session started on the first instance? Can a user withdraw money ten times just because she hit ten different instances that are out of sync with each other?). Remember, the cloud is not something magical (outside of the context of Care Bears and their magic clouds (Care-a-Lot)). So, you must make sure that your application is SaaS-ready before you can realize all the benefits of the cloud. Otherwise, you are deploying your application to one physical server on the cloud and will never realize the scalability or clustering powers that are available to you for a small fee.
So, a website can run on the cloud, and will utilize scalability and clustering capabilities of the cloud if and only if it is designed in such a way that makes it service-oriented. You can still have session management, but that needs to be designed to be distributed and managed outside of the services that your website server side code uses. This way, you can scale out the components that are stateless, and replicate the stateful cache.
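
The distributed-session idea can be sketched in a few lines of Python; the dictionary below stands in for an external, replicated cache shared by all instances, and every name is illustrative:

```python
# Sketch of externalized session state: stateless web workers keep session
# data in a shared store instead of in-process memory, so any instance can
# serve any request. The dict stands in for a replicated external cache.

SESSION_STORE = {}

def handle_request(instance_id: int, session_id: str, deposit: float = 0.0) -> float:
    """Any instance can serve any request because state lives outside."""
    session = SESSION_STORE.setdefault(session_id, {"balance": 0.0})
    session["balance"] += deposit
    return session["balance"]

# Instance 1 takes the deposit; instance 2 still sees the same session.
handle_request(instance_id=1, session_id="user-42", deposit=100.0)
balance = handle_request(instance_id=2, session_id="user-42")
assert balance == 100.0
```

With state out of the workers, the “withdraw money ten times” problem goes away: ten out-of-sync instances become ten interchangeable, stateless workers in front of one consistent store.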

So, in summary, the cloud continues to have the same concept it had before (as a UML-like component on a blackboard) by continuing to be the important but irrelevant part of the system. And, coming full circle, I open my mouth, and this time I am confident that I have covered all the possible paths a curve-ball may take. I told her that her company’s website can actually be run on the cloud. In this case, the cloud would be her company’s host provider. However, the cloud is not a one-technology thing; it is a mixture of solutions and resources. Depending on what her company requires the website to do/offer, and the guarantees they provide to their customers about the website’s availability and up-time, they may have to review which cloud service provider offers the best contract (SLA – service level agreement), and she may need to check with her developers that the website is cloud-ready and will take full advantage of those guarantees. She followed up with a question: “I see, so what is SaaS?” Oh man! Sometimes I wish I could copy and paste my thoughts into speech so I wouldn’t have to repeat what I said in this blog to her again. But, it may actually be good that I get to talk about it again, because this time I will have to describe what SaaS is in even simpler terms. So, I said: Well, if your website offers a service that your customers use on the website, and you would also like them to be able to use that same service on the iPhone, iPad, Droid, etc., as well as offer that exact same service to THEIR customers on their website, then you have to provide that one service as a SaaS. Which means that your developers have to know how to design it in such a way that it is detachable from your website and available to be called and used by other third-party (your customers’) applications. Your website will use that service as well.
It is good to design all those usable services as such when you go to the cloud because it will enable you to take advantage of what the cloud provides such as scalability (you can handle spikes in user requests).

She seemed to follow my statements and logic. So, maybe I did pass the “explain-it-to-your-grandma” test! Although I may have hurt my chances of having her call me back after I give her my business card so I can provide consultation on how to design such a cloud-ready application, since my answers seemed to paint a pretty and easy picture of this whole complex cloud concept! The cloud can offer pretty much anything anybody needs on the web (because of its “potential” to be proprietary-free, interoperable, and pluggable), except the opportunity to take back what you just said to someone else 🙂

Cloud Computing Simplified (Part I)

December 14, 2010

Gartner projected that cloud computing would continue to pick up momentum next year, cementing its position as one of the top four technologies that companies will invest in. Earlier this past November, Gartner projected that cloud computing would (continue to) be a revolution much like e-business was, rather than just another technology added to the stack of the IT arsenal.

What is cloud computing? A tech-challenged friend of mine asked one time. As I took a deep breath before letting out the loads of knowledge that I have accumulated reading about, teaching about and working with cloud applications in their various layers, I froze for a second as I did not know how to explain it in layman’s terms. Of course, if I were talking to an IT person I would get away with talking about the mouthful of ?aaS acronyms (where ? is any letter in the alphabet). However, when the person opens the discussion asking “Is my company’s website a SaaS?”, “Is my company’s website a cloud? The cloud? Running in a cloud?”, it hit me that maybe there was a reason why every book I pick up or article I come across that talks about the cloud includes a first paragraph with a disclaimer that goes like this: “The first thing about cloud computing is that it has many definitions, and no one can define it precisely, but we will give it a try in this book”. Maybe that is how I should start every attempt to define the cloud to someone else?

Before I answered my friend’s question, I phased out for a little bit as a big blackboard suddenly appeared in front of me with a humongous set of UML diagrams (they were called UML, but they included every possible shape, along with a verbal disclaimer from the architect drawing on the board for getting away from the standard diagrams). Those diagrams were so big and complex that people moved away from typing notes to taking pictures of the board. “High resolution” became an important differentiator between competing smart phones before purchasing one (the iPhone won the war for me as it has the best resolution as of the time of this typing), because at the end of the meeting you would see everyone in the room shouldering each other as they took a picture of the complex diagram using their phone. None of that stuck out in my mind, as I was phasing out, as much as the big cloud diagram drawn to the side of the big complex picture. That component was always known by everyone as the Cloud, and it always meant “everything else”, “something else”, “anything”, “everything”, “something”, “I have no freaking clue what there is”, etc. It was the piece that no one cared about; it was just sitting there encompassing a big piece of the entire system, yet it was (apparently) of less importance than everything else on the board, thus earning its “cloudy” scribble to the side of the board. OK, so the Cloud is something useless that no one pays attention to? That may be a cop-out if my friend hears those words out of my mouth after a pause.

All of a sudden, something else happened. As I was dazing off, I started thinking to myself: Well, wait a minute. Did we change our definition of that little useless cloud drawn on the board? Or is it really useless? To be able to answer that question, I had to understand why the hell we resorted to drawing “everything else”, “something else”, etc. as an isolated cloud. Maybe it wasn’t useless. Maybe it was extremely important but irrelevant. Which is a big difference. Sometimes it is good to have a big chunk of your system abstracted away in its own little cloud shape, irrelevant to any other changes you make to your system. After all, if every change you make affects everything else in the system, including your infrastructure, there is a big problem in your system’s design to begin with. So, maybe it was a good thing that architects had a big cloud sitting off to the side. As a matter of fact, the bigger the cloud that neither affects nor is affected by your changes to the rest of the system, the better! OK, so now I have solved one mystery. The cloud is the part of your system (infrastructure, platform, and software) that is abstracted from you because your system was designed correctly, in such a way as to have its components independent of each other. This way, you can ultimately concentrate on the business problem at hand, rather than all the overhead that is there just as an enabler rather than being fundamental to the business at its core.

But wait! If architects have been drawing clouds on their boards for such a long time, what is so new about it? The short answer is: nothing is new about the cloud! The cloud is not something that just popped up. It has been around for a long time! From clusters of mainframes in the 60s and 70s, to grid computing, the cloud has been there for a long time. The reason why it was picked up by the professional world just recently is the giant improvements in virtualization management tools that allow the customer to easily manage the complexity of clouds. OK, I am getting somewhere, but I still haven’t gotten to my friend’s questions. Maybe I should continue to go deeper and find more answers first before I unload my knowledge, or lack thereof! Going back to the board. Was every cloud really a cloud? After all, sometimes an architect draws a cloud around a piece or a component that she doesn’t understand. So, there has to be a difference between the cloud that a knowledgeable architect draws and one drawn by someone who mistook it for another UML diagram component. So, what defines a cloud? Well, it is an abstracted part of the system. It may encompass infrastructure, platform and/or software pieces.

That is the easy part. Here comes the more technical part. The cloud provides infrastructure, platform and software on an as-needed basis (service). It gives you more resources when you need them, AND it takes away extra resources you are not using (yes, the cloud is not for the selfish among us). To be able to call it a form of Utility Computing (where you pay only for the resources you use), a cost is associated with the resources you use on an hourly (or per-resource) basis. Yes, if it is free, then it is not the cloud. No place for socialists on the cloud. We will skip this argument of paid versus free, because you will tell me that many cloud-based applications are available for free, and I will tell you that nothing is free, because those same applications provide a free version (which is just a honeypot for businesses) and a paid version (which is where businesses need to end up just to match the capabilities of their in-house software). The cloud may be free for individuals like me and you who don’t care about up-time and reliability, but I am focusing my discussion here on businesses (after all, an individual will not have their own data center, or require scalability, etc.). So, I win the argument, and we move on!
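
The pay-per-use arithmetic is simple enough to sketch; the hourly rate below is made up purely for illustration:

```python
# Back-of-the-envelope utility-computing math: you pay only for the
# resource-hours you actually use. The rate is a made-up example figure.

rate_per_instance_hour = 0.10  # hypothetical dollars per instance-hour

def monthly_cost(instances_by_hour):
    """Sum the instances running in each hour, times the hourly rate."""
    return sum(instances_by_hour) * rate_per_instance_hour

# 2 instances for 700 quiet hours, then 10 instances during a 20-hour spike:
usage = [2] * 700 + [10] * 20
print(round(monthly_cost(usage), 2))
```

Compare that to provisioning 10 servers around the clock just to survive the spike: with utility pricing, the extra eight instances cost money only for the 20 hours they actually run.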

Another condition is for all those resources to be available via TCP/IP-based requests (iSCSI or iFC for hardware, web-services API calls for platform and software resources). It is important that requests and responses to the cloud’s resources travel over the web, otherwise you cannot scale out to other data centers sitting somewhere else. The cloud is scalable (“infinite” resources), DR (disaster recovery) ready, FA (fail-over) ready, and has HA (high availability). The last three characteristics are made possible by a few technology solutions that sit on top of the cloud, such as virtualization (although virtualization is not a necessary component of a cloud, due to the use of commodity computers, discussed below). With virtualization management tools, DR (for applications only), FA and HA are provided out of the box; DR for databases is made possible by replication tools (with acceleration and de-duplication techniques speeding up the process across physical LANs). Scalability is provided by the SOA (service-oriented architecture) characteristics of the cloud: its independent and stateless services. Another component that is essential to defining what a cloud is, is the use of commodity computers. It is essential because single-CPU power is not relevant anymore as long as you have an infinite pool of your average joe-CPUs. Google builds its own commodity computers in-house (although no one knows the configurations of such computers), and it is believed that Google has up to 1,000,000 commodity computers powering its giant cloud. If a cloud service provider is using powerful servers instead of commodity computers, that is an indication that they don’t have enough computers to keep their promise of scaling up for you.
The reason why both properties (powerful servers and “unlimited” capacity) cannot co-exist is that if the provider had unlimited powerful servers, then to offset their cost (especially cooling) they would have to charge you extremely high prices, offsetting the benefit of moving to the cloud versus having your own data center. Many people also consider the cloud to be public (meaning that your application may be running as a virtual machine right next to your competitor’s on the same physical machine, with a low probability – albeit bigger than 0%).

Many companies choose to run their own “cloud” data center (private), which theoretically violates a few of the concepts we defined above (cost per usage, unlimited scalability). That is why some don’t consider private clouds to be clouds at all. However, to their dislike, private clouds have dominated the early market, as companies have bought into the cloud concept but still fear that it uses too many new technologies, which makes it susceptible to early security breaches and problems. Amazon had a major outage in the East region in Virginia toward the end of 2009 (which violates the guarantee that your services are always available because they are replicated across separate availability regions). So, we know what private clouds are (internal data centers designed with all the properties of a cloud minus the cost per usage and unlimited resources), and what public clouds are (gmail anyone?). What if you want a combination of both? Sure, no problem, says the cloud. You have the hybrid cloud: a combination of private clouds for your internal, business-critical applications that do not require real-time scalability, and public clouds where you push your applications that require the highest availability and scalability (take Walmart’s website around the holidays, for example). This is going to be the future of the cloud, as there are always going to be applications that do not need the power of the cloud, such as email (which does not need to scale infinitely in real time), HR, financial applications, etc. There is a fourth kind, which is the virtual cloud. This gives you the advantages of both worlds (public resources but private physical data centers). You have access to a public cloud, but your own secure, isolated vLAN that you can access via VPN over HTTP (IPSec). The virtual cloud guarantees that no other company’s applications will run on the same physical hardware as your company’s.
Your internal applications will connect to your other services sitting on a public cloud via secure channels and dedicated infrastructure (on the public cloud).

If you are confused about how applications (for various companies) can be running on the same physical machine, and why in hell you would want to do that, check out my two-part series about virtualization.

— To be continued here

Virtualization – Under the Hood (Part II)

November 19, 2010

This is a continuation of my last post (Virtualization – Under the Hood (Part I)).

Q: Can you actually have more memory allocated than available physical memory? And how?

Short Answer: Yes. Through  many techniques including: Transparent page sharing, page reclamation, balloon driver, etc.

Long Answer: You can actually start many VMs with total allocated memory that is more than the physical memory available on the server, because not all applications will utilize 100% of their requested memory at all times. Page reclamation allows the hypervisor to reclaim unused (previously committed) pages from one VM and give them to another. Another technique a hypervisor may use is to allow VMs to share memory without them knowing it! Sounds scary, but it is nonetheless manageable by the hypervisor. This allows more VMs to start with their full requirements of allocated memory met, although they may be sharing memory pages with other VMs. Lastly, there is the approach of ballooning memory out of a VM. This is more of a cooperative approach, where the hypervisor requests memory from all executing VMs, and they voluntarily balloon out the memory pages they are not using. Once they need the memory back, the hypervisor sends it back after obtaining it from other VMs using any of the methods above. Swapping pages with the hypervisor is an expensive operation. That is why you should always start your VM with a pre-set Reservation amount (the minimum amount of memory guaranteed to the VM by the hypervisor). However, the more you reserve upon start-up of your VM, the fewer VMs can be fired up on that same physical host.
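
The overcommit idea can be illustrated with a toy accounting model in Python; all of the memory figures below are made up:

```python
# A toy accounting model of memory overcommit: total *allocated* memory can
# exceed physical memory because VMs rarely touch all of it at once.
# All numbers are illustrative.

physical_mb = 8192

vms = [
    {"name": "vm1", "allocated_mb": 4096, "in_use_mb": 1500},
    {"name": "vm2", "allocated_mb": 4096, "in_use_mb": 2000},
    {"name": "vm3", "allocated_mb": 4096, "in_use_mb": 1000},
]

allocated = sum(vm["allocated_mb"] for vm in vms)
in_use = sum(vm["in_use_mb"] for vm in vms)

# Overcommitted on paper, but fine in practice while actual use fits:
assert allocated > physical_mb
assert in_use <= physical_mb
print(f"allocated={allocated} MB, in use={in_use} MB, physical={physical_mb} MB")
```

The techniques in the answer above (page reclamation, transparent page sharing, ballooning) are what the hypervisor reaches for when `in_use` starts creeping toward `physical_mb`.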

Q: How do you make sure your application is highly available?

Short Answer: It depends on the approach, and the virtualization suite you are using. You either take things into your own hand and cluster your application over 2+ VMs, and make sure you replicate necessary data over to the redundant VM(s), or use the tools provided by virtualization suite to move the VM to a less utilized or crowded host.

Long Answer: High availability of VMs can be jeopardized in one of two ways:

1. Your VM is running on a highly utilized host, making your applications less responsive. In this case you can use the virtualization suite to transfer or migrate your running VM to another host that is less crowded. vSphere provides vMotion, which is used by their HA appliance to migrate your VM to another host without taking the VM down! They actually start copying your VM byte by byte, starting with the sections of memory that are not or under-utilized at the moment, while keeping track of all pages “dirtied” since the last transfer, to be re-transferred again to keep the copy consistent on the new host. At some point the hypervisor of the first machine turns off the VM, while the hypervisor on the target machine turns it on simultaneously. Microsoft added Live Migration to the R2 release of Hyper-V to do just that. There are many metrics and thresholds that can be configured to trigger such an action. Dynamic Resource Scheduling (DRS) in vSphere allows you to set those parameters, and DRS will manage moving your VM from one host to another within the cluster to ensure the highest availability and accessibility.
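
The iterative pre-copy described above can be sketched as a simple loop; the page counts and re-dirty rate below are illustrative, not measurements of any real hypervisor:

```python
# Sketch of iterative pre-copy migration: copy all pages, then keep
# re-copying the pages dirtied since the last pass, until the dirty set
# is small enough to stop the VM and switch hosts. Numbers are made up.

def precopy_rounds(total_pages: int, dirty_fraction: float, threshold: int) -> int:
    """Return how many copy rounds run before the final stop-and-switch."""
    remaining = total_pages
    rounds = 0
    while remaining > threshold:
        rounds += 1
        # While `remaining` pages are being copied, a fraction of them
        # get dirtied again by the still-running VM.
        remaining = int(remaining * dirty_fraction)
    return rounds

# 100k pages, 10% re-dirtied per pass, switch over once under 100 pages:
print(precopy_rounds(100_000, 0.10, 100))
```

The shrinking dirty set is why the final switch-over pause can be so short: by the last round only a handful of pages remain to copy while the VM is stopped.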

2. When a host goes down, another VM needs to fire up to start taking requests. This can be done using virtualization suite tools (only when data replication is not required). However, when you need to cluster your data as well, you will need to introduce data replication yourself, such as Microsoft SQL Server clustering. This will allow the new VM to immediately serve requests as soon as the first VM goes down. Of course, there will need to be some sort of switch control at the virtualization suite management level, or via an external content-switching appliance such as NetScaler.

Q: Is virtualization the equivalent of parallel processing on multiple processors?

Short Answer: Yes and no.

Long Answer: Virtualization introduces the capability of borrowing CPU cycles from all CPUs physically available on the host. The side-effect of this is the appearance of parallel processing. However, the only reason a hypervisor would want to borrow cycles from another CPU is that the first CPU it had access to is fully utilized. So, technically, you are not really parallelizing to run things in parallel, but rather using as many CPU cycles as your application needs to run its single- or multi-threaded code.

Q: Since we are running multiple VMs on the same host, doesn’t that mean we share the same LAN? Wouldn’t that be a security threat if one application on one VM was misconfigured to access an IP of a service on another VM?

Short Answer: Yes. But we don’t have to share the same network even among VMs on the same host.

Long Answer: You can create virtual LANs even between VMs running on the same physical host. You can even use firewalls between VMs running on the same host. This way you can create DMZs that keep your applications (within your VM) safe no matter which VMs are running on the same host.

Q: Since a hypervisor emulates hardware, does that mean that my guest operating systems are portable, EVEN among different hardware architectures?

Short Answer: Yes.

Long Answer: Most virtualization suites support x86 architectures because they are “better” designed to take advantage of virtualization. It also depends on the virtualization suite you are using and what guest OSes it supports (for example, vSphere does not support AIX). Additionally, although in theory those guest OSes and their hosted applications are portable, it also depends on the hypervisor’s own implementation of drivers on the system. The hypervisor does not use the drivers installed inside the guest OS, but its own set of drivers. The implementation could vary from one system to another, one device to another. So, you may well end up with different behavior or performance on different hardware, even using the same VM.

Note: The Open Virtualization Format (OVF) is a standard format to save VMs in so you can migrate them to another virtualization suite (not just other hardware!). However, not many virtualization tools support this format yet.

Q: What about security? Who controls access to VMs?

Short Answer: Virtualization suites provide user management. This list is separate from application users.

Long Answer: There are many layers of user roles and permission management in a virtualization suite, depending on the suite itself. Typically, you can create users, define their roles, their access to VMs, and what type of permissions they get. You can even create pools of VMs and apply the same set of user role/permission combinations. This eliminates having to manage security and authentication on each individual hypervisor, and instead lets you do the management across a host of them.

Q: Ok, ok, ok. How about the dark side of virtualization?

Short Answer: There are many problems or potential problems with virtualization. It is not all roses.

Long Answer: There could be many reasons why not to use virtualization including:

1. With virtualization, you are now collapsing the system administration and networking teams (and possibly security as well) into one team. Most (if not all) virtualization suites do not provide separate roles for managing the virtualized data center along those divisions. Once you have administrator access to the virtualized data center, all divisions are off at that point. This can be seen as a good thing. However, it is mostly a bad thing, because a system administrator is not necessarily a person who is highly specialized in dissecting the network among all the various applications based on the requirements and capabilities of the enterprise.

2. Upgrading or updating one application or VM requires a lot more knowledge of its requirements and potential effects on other VMs on the same host. For example, if an application doubles its memory requirements, the IT administrator managing the virtual environment must know about it, even if the host has enough physical memory for the increase. In a traditional environment, as long as the physical memory is available, the IT administrator deploying the updates or upgrades does not necessarily need to know of the new memory requirements of the application, as long as no additional physical memory needs to be attached to the server. This change forces administrators of the virtual environment to be more aware and knowledgeable of the applications and VMs running in their system, which is not a hard-line requirement in traditional systems.

3. If the number of VMs falls under 10 or so per host, then you may be adding more overhead than realizing benefits from virtualizing your machines.

4. Testing and debugging system events is a lot more involved now, as an administrator has to chase the VM wherever it goes in order to check the trail of events across those machines, plus look at the guest OS event logs to complete the picture before understanding the problem.

5. Created VMs require physical storage space as well (for the VM files themselves). This is overhead, and if not managed correctly you may end up bursting your storage capacity bubble by over-creating VMs.

6. Miscellaneous: expensive management tools, new IT skills to be learned, a single point of failure (if one host goes down, it takes down many VMs with it), more bandwidth headaches when one physical host starts up (many VMs initializing at the same time), etc.