Big data architecture and patterns, Part 1: Introduction to big data classification and architecture

Big data architecture and patterns, Part 2: How to know if a big data solution is right for your organization

Big data architecture and patterns, Part 3: Understanding the architectural layers of a big data solution

I suck at math, and Khan Academy saved me.

As I was progressing through the Machine Learning course by Andrew Ng, I felt lost with the math of regression. After not too long on Google, I decided to watch the regression videos on Khan Academy. It took less than two hours to put me back on track :-).

Thanks to the guy who made the regression videos (I couldn’t find your name on the website). I owe you a lot :-)

Thanks Mr. Khan, Mr. Ng and thanks to everyone contributing to the free MOOC movement.

RIP Aaron Swartz


He is one of the most inspirational people (to me, at least). I’m not exaggerating when I say he’s been a superhero of mine since I first knew of him in early 2012, when the whole SOPA thing came up. He started http://demandprogress.org, and I contacted him to ask how to support the campaign from Egypt.

Aaron committed suicide yesterday after falling into depression while facing a 35-year jail sentence, all for believing that knowledge should be free and available to everybody.

I’m not a good writer, so I’m referring you to @ggreenwald’s article in the Guardian to learn more about him and what he did for the Internet community.

The inspiring heroism of Aaron Swartz

TIL Command and Query Responsibility Segregation

When most people talk about CQRS they are really speaking about applying the CQRS pattern to the object that represents the service boundary of the application. Consider the following pseudo-code service definition.

CustomerService

void MakeCustomerPreferred(CustomerId)
Customer GetCustomer(CustomerId)
CustomerSet GetCustomersWithName(Name)
CustomerSet GetPreferredCustomers()
void ChangeCustomerLocale(CustomerId, NewLocale)
void CreateCustomer(Customer)
void EditCustomerDetails(CustomerDetails)

Applying CQRS to this would result in two services:

CustomerWriteService

void MakeCustomerPreferred(CustomerId)
void ChangeCustomerLocale(CustomerId, NewLocale)
void CreateCustomer(Customer)
void EditCustomerDetails(CustomerDetails)

CustomerReadService

Customer GetCustomer(CustomerId)
CustomerSet GetCustomersWithName(Name)
CustomerSet GetPreferredCustomers()

That is it. That is the entirety of the CQRS pattern. There is nothing more to it than that… Doesn’t seem nearly as interesting when we explain it this way, does it? This separation, however, enables us to do many interesting things architecturally. The largest is that it forces us to break the mistaken assumption that because the two use the same data, they should also use the same data model.

Source: Wikipedia
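To make the split above a bit more concrete, here is a minimal Python sketch of the same idea. It is only an illustration under my own assumptions: the in-memory dict used as storage, the `Customer` fields, and the snake_case method names are made up for the example, and `EditCustomerDetails` is left out to keep it short. The point is only that commands change state and return nothing, while queries return data and change nothing.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    # Hypothetical fields, chosen only for this example.
    customer_id: int
    name: str
    locale: str = "en"
    preferred: bool = False


class CustomerWriteService:
    """Commands: every method changes state and returns None."""

    def __init__(self, store: dict) -> None:
        self._store = store  # assumed storage: a plain dict keyed by id

    def create_customer(self, customer: Customer) -> None:
        self._store[customer.customer_id] = customer

    def make_customer_preferred(self, customer_id: int) -> None:
        self._store[customer_id].preferred = True

    def change_customer_locale(self, customer_id: int, new_locale: str) -> None:
        self._store[customer_id].locale = new_locale


class CustomerReadService:
    """Queries: every method returns data and changes nothing."""

    def __init__(self, store: dict) -> None:
        self._store = store

    def get_customer(self, customer_id: int) -> Customer:
        return self._store[customer_id]

    def get_customers_with_name(self, name: str) -> list:
        return [c for c in self._store.values() if c.name == name]

    def get_preferred_customers(self) -> list:
        return [c for c in self._store.values() if c.preferred]


if __name__ == "__main__":
    store = {}
    writes = CustomerWriteService(store)
    reads = CustomerReadService(store)

    writes.create_customer(Customer(1, "Ada"))
    writes.make_customer_preferred(1)
    print(reads.get_preferred_customers())
```

Both services share one store here for simplicity. Nothing in the pattern requires that; once reads and writes sit behind separate services, each side is free to use its own data model.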

Recursion and Tail-recursion

What is recursion?


The concept is general: stories inside stories, movies inside movies, images inside images…

Formal definition?
Recursion is when a function calls itself.

Why would a function call itself?
There are LOTS of algorithms for which the simplest and easiest-to-maintain version is recursive:
sorting and searching algorithms, and operations on most data structures.

The factorial is a simple recursive function.
n! is the product of all positive integers less than or equal to n. For example:

5! = 5 * 4 * 3 * 2 * 1

The first thing to think about is the base case: when do we stop?
When we reach 1 …

if Number == 1; then the answer is 1

Otherwise, the answer is Number times the factorial of Number - 1:

factorial(Number) = Number * factorial(Number - 1)

In pseudocode it would be

factorial(Number)
    if Number == 1    #Our base case
        return 1
    else
        return Number * factorial(Number - 1)

This function will be executed like this:

5! = 5 * factorial(4)
5! = 5 * 4 * factorial(3)
5! = 5 * 4 * 3 * factorial(2)
5! = 5 * 4 * 3 * 2 * factorial(1) #Where it meets the base case
5! = 5 * 4 * 3 * 2 * 1 = 120 … #Congratulations!

Is it good practice to keep that many pending operations in memory?

Do you see the growth rate of this function? The chain of pending multiplications grows with the number of elements to multiply, and the results themselves grow very quickly:
2! = 2, 3! = 6, 4! = 24, 5! = 120 … 9! = 362880
Dare we calculate 120!? 🙂 …
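For anyone who wants to try it, here is a small runnable Python sketch of the plain recursive version. The function and variable names are my own. The base case uses `<= 1` instead of `== 1` so that 0! also works, which is a small deviation from the pseudocode. Python integers have arbitrary precision, so 120! itself is no problem. What eventually breaks is the depth of the call chain, because every pending multiplication keeps a stack frame alive.

```python
import math
import sys


def factorial(number: int) -> int:
    """Plain recursive factorial: each call waits for the next one."""
    if number <= 1:  # base case (also covers 0! = 1)
        return 1
    # The multiplication can only happen after the inner call returns,
    # so one stack frame stays alive per pending multiplication.
    return number * factorial(number - 1)


if __name__ == "__main__":
    print(factorial(5))
    # Cross-check the big one against the standard library.
    print(factorial(120) == math.factorial(120))

    # Go deeper than the interpreter allows to see the cost of stacking.
    too_deep = sys.getrecursionlimit() + 100
    try:
        factorial(too_deep)
    except RecursionError:
        print(f"factorial({too_deep}) exceeded the recursion limit")
```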

Another concept introduced here is tail recursion, which transforms the linear recursive process above into an iterative one.

To eliminate this stacking of operations, we want to perform each multiplication as we go instead of deferring it.
So we need to carry an extra variable as a parameter of our function. It is called an accumulator.

The accumulator is a place to hold our calculation results as they happen.

A tail-recursive version of our function:

tail_factorial(Number, Accumulator = 1)
    if Number == 1    #Our base case
        return Accumulator
    else
        return tail_factorial(Number - 1, Accumulator * Number)

The execution of tail_factorial(4) looks like this:

tail_factorial(4)     = tail_factorial(4, 1)
tail_factorial(4, 1)  = tail_factorial(4-1, 4*1)
tail_factorial(3, 4)  = tail_factorial(3-1, 3*4)
tail_factorial(2, 12) = tail_factorial(2-1, 2*12)
tail_factorial(1, 24) = 24 #Reach the base case

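Do you see the difference? Each multiplication is done before the next call, so nothing is left pending. In a language that optimizes tail calls, we never need to hold more than two terms in memory!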
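One caveat if you try this in Python: CPython does not eliminate tail calls, so the tail-recursive version still uses one stack frame per call there. The sketch below (names are my own, and the base case uses `<= 1` so that 0 is handled too) shows the tail-recursive function next to the loop that a tail-call-optimizing compiler would effectively turn it into. The loop really does hold only two values: the counter and the accumulator.

```python
import math


def tail_factorial(number: int, accumulator: int = 1) -> int:
    """Tail-recursive form: the recursive call is the very last action."""
    if number <= 1:  # base case
        return accumulator
    # Nothing is left to do after this call returns, which is what
    # makes it a tail call. CPython still pushes a new frame for it.
    return tail_factorial(number - 1, accumulator * number)


def loop_factorial(number: int) -> int:
    """The same process written as a loop: constant stack depth."""
    accumulator = 1
    while number > 1:
        accumulator *= number  # same update as the tail call's arguments
        number -= 1
    return accumulator


if __name__ == "__main__":
    print(tail_factorial(4))
    print(loop_factorial(4))
    # The loop handles inputs far beyond the default recursion limit.
    print(loop_factorial(3000) == math.factorial(3000))
```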