Predicting the Future with Google Trends

A recent article mentions several studies which use public data to try to predict the trend of the stock market or of house values. These studies use data available to the public via Google Trends or Twitter. The first correlates the public mood, extracted from Twitter logs, with market values. It seemed to demonstrate that stock values depended mostly on calm, and that happiness could also, to some extent, have an impact on whether stocks tend to rise or fall.

The second study uses Google Trends query logs on housing to predict whether houses are going to be sold or not. This follows another project Google set up last year, where it used its search queries to predict flu trends after the H1N1 outbreak.
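To make the idea of correlating two public data series a little more concrete, here is a minimal sketch that computes a Pearson correlation coefficient with awk. The `pearson` helper and the numbers fed to it are made up for illustration; they are not taken from Google Trends or from the studies mentioned above.

```shell
#!/bin/sh
# Hypothetical helper: reads two whitespace-separated columns (x y) per line
# on stdin and prints the Pearson correlation coefficient of the two columns.
pearson() {
    awk '
        NF >= 2 { n++; sx += $1; sy += $2; sxx += $1 * $1; syy += $2 * $2; sxy += $1 * $2 }
        END {
            den = sqrt((n * sxx - sx * sx) * (n * syy - sy * sy))
            if (n < 2 || den == 0) { print "nan"; exit }
            printf "%.3f\n", (n * sxy - sx * sy) / den
        }
    '
}

# Made-up weekly numbers: search volume for a housing query vs houses sold.
printf '%s\n' "10 3" "20 5" "30 7" "40 9" | pearson
```

Because the made-up numbers lie exactly on a straight line, this prints 1.000. Real series would give a value somewhere between -1 and 1, and a high value alone would not prove that one series predicts the other.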

Google also recently added a number of new datasets to its public data tool, which gives us a really interesting …

If correlations on these datasets were successful enough to predict house market values, and probably the stock exchange, and if anonymous mood information can give us the trends of the market, this is still nothing compared to the information which can actually be extracted from social networks such as Facebook, or from Google history. By including the relations between people, geolocation and the information people actually know, we will not only be able to predict but actually know what people are planning to do, learn or discover.

Historically we have always seen information going in one direction: from event to press to public. But if we actually know who has read which article, then we know exactly what they know and therefore, based on that information, what choices they are capable of making. The control of information in this case is complete.

Today was also the day WikiLeaks published a large amount of information on the Iraq war, including this very interesting casualties map. Many people had predicted that this would happen in Iraq, but maybe, if we had had the tools to prove it, we could have avoided a huge number of casualties.

Basic Linux administration

Just a quick reference for Linux dedicated server administration

Basic Administration:

I left the initial release nearly as soon as I started using the server, and have had a hard time working a lot of things out, so I thought I would share a few basic administration techniques. Maybe this will just be for myself. Maybe it will be of use to someone, one day.

Basic commands:
See what is going on with the machine:

Processes:
>> top
>> ps aux

Space left on disks
>> df -h
-h is the option you tend to use to see sizes in human-readable format
>> du -h --max-depth=1
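As a small sketch of how the df output can be used in a script, the following prints a warning when the root partition is nearly full. The `usage_percent` helper and the 90% threshold are assumptions chosen for illustration.

```shell
#!/bin/sh
# Hypothetical helper: prints the usage percentage (without the % sign)
# of the filesystem that holds the given path.
usage_percent() {
    # -P forces the POSIX one-line-per-filesystem output,
    # so column 5 of the second line is the "Capacity" field.
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Assumed threshold of 90%; adjust to taste.
THRESHOLD=90
used=$(usage_percent /)
if [ "$used" -ge "$THRESHOLD" ]; then
    echo "Warning: / is ${used}% full"
else
    echo "/ is ${used}% full"
fi
```

This only looks at one mount point; calling `usage_percent` on /home or /var works the same way.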

You can move a folder from your system partition to another one with more space by replacing the folder with a symbolic link. I am not sure this is best practice, but if you run out of space, this can be a life saver.

Example:
>> mkdir -p /home/system/var && mv /var/log /home/system/var/log && ln -s /home/system/var/log /var/log
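The same move can be wrapped in a small function that checks a few things first. This is only a sketch: `relocate_dir` is a hypothetical helper, and the demonstration runs on throwaway directories rather than on /var/log.

```shell
#!/bin/sh
# Hypothetical helper: moves directory $1 to $2 and leaves a symbolic link at $1.
# Refuses to run if the source is missing, is already a link, or the target exists.
relocate_dir() {
    src=$1
    dest=$2
    if [ ! -d "$src" ] || [ -L "$src" ]; then
        echo "source must be a real directory: $src" >&2
        return 1
    fi
    if [ -e "$dest" ]; then
        echo "target already exists: $dest" >&2
        return 1
    fi
    # Each step only runs if the previous one succeeded.
    mkdir -p "$(dirname "$dest")" &&
        mv "$src" "$dest" &&
        ln -s "$dest" "$src"
}

# Demonstration on throwaway directories rather than on /var/log.
demo=$(mktemp -d)
mkdir -p "$demo/small/log"
echo "hello" > "$demo/small/log/messages"
relocate_dir "$demo/small/log" "$demo/big/var/log"
cat "$demo/small/log/messages"
rm -rf "$demo"
```

The final cat reads the file through the new link. On a real server you would probably want to stop whatever is writing to the folder before moving it, which this sketch does not check.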

Virtual Memory
>> free

Server information
>> uname -a
>> hostname
>> ifconfig

Data manipulation

>> history | less

Finding data

>> which php
>> whereis php

Users and privileges:

>> passwd
>> chown
>> chmod
>> chgrp

File Content:

>> less filename
>> head filename
>> tail filename
>> vim filename

>> echo "whatever you want to add" >> filename

Putting text in a file
>> history > filename.txt

Database:

>> mysql -h localhost -u username -p database_name < filename.sql
>> mysqldump -h localhost -u username -p --add-drop-table database_name > filename.sql

Useful queries:

Database sizes:

SELECT table_schema "Data Base Name", sum( data_length + index_length) / 1024 / 1024
"Data Base Size in MB" FROM information_schema.TABLES GROUP BY table_schema ;

Archive:

unpack
>> tar xvfz archivename.tar.gz
pack
>> tar cvfz archivename.tar.gz folder1 folder2 …

But there is nothing really specific to Gentoo here, and there is much more to Linux administration than this. You should also find many better manuals around, which cover this in much more detail.

Updating Hardware clock

Checking the time:
>> date

Checking the time on a time server:
>> htpdate -d -q www.linux.org www.freebsd.org

To update the time, simply replace the -q with -s:
>> htpdate -d -s www.linux.org www.freebsd.org

Gentoo web – administration

I have been building websites since 1997, hosting them on my own servers since 2001, using dedicated servers since 2004 and OVH Gentoo Release2 since 2006.

Why use dedicated servers instead of home hosting?

- cost: electricity + computer + a broadband connection 24/7 turns out to be more expensive than renting a server
- security: open ports on your local area network are a security risk to the other computers on your network
- bandwidth: hosting services provide 100M of upload bandwidth, home is limited to 256K and shared with your own use
- maintenance: a computer which is on 24/7 is more likely to break; if something needs replacing or the connection goes down, you have people to sort that out for you
- comfort: if you live in a small place, there is no need to have a computer on 24/7 in the room you are living in
- static location: over the years you are likely to move or to change ISP, and you will not need to change all your domain names or use a dyndns

Why use OVH release2?

I have been using the OVH releases since 2003:

Release 1: redhat 7.4 from 2004 to 2007
Release 2: gentoo from 2007 onwards

The advantages of the OVH releases are:
- ready to use immediately for web hosting, with all applications pre-installed
- one step to add a domain and configure all applications with “ovhm”: email, website, DNS, web analytics
- webmin: graphical interface to manage the server
- good for Linux newbies

Downsides:

- Gentoo is not easy for Linux newbies (Ubuntu or Red Hat are much easier and more standard for beginners)
- Debian is probably more standard for professionals
- as soon as you need to do something new, you tend to go outside the initial release

How to manage updates for Gentoo?

Gentoo is really all about its way of managing packages with “emerge”, which does not work in the same way as “aptitude” for Debian-based or “yum” for Red Hat-based systems.

“emerge” will download and recompile each updated package from scratch on every update. If you manage your packages properly, you will therefore get top speed out of your system. Understanding emerge is the key to handling Gentoo properly.

You first have to understand that all your package information is stored locally in your portage tree, which you can update using either:

>> emerge --sync;
or better:
>> eix-sync;

Once this is done, you can update your system:

>> emerge --update --deep --newuse --ask world;

You can also update just part of world, such as:
>> emerge --update --deep --newuse --ask system

But I tend to just do the entire tree directly.

If all worked fine, you will then need to clean old, obsolete packages off your system:

>> emerge --depclean

You then need to rebuild the packages which still link against old libraries, to make sure you don’t have any old references lying around. You can do this for one specific library:

>> revdep-rebuild --library liblber-2.3.so.0

But you can also try and do the entire system at once simply using this:

>> revdep-rebuild

You will also need to replace your configuration files after the system has been updated.

The best tool I have found to do this so far is:
>> dispatch-conf

You can also clean up the downloaded source files and binary packages by running

>> eclean-dist --interactive
and
>> eclean-pkg --interactive
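The update routine described above can be sketched as one script that runs the steps in order and stops at the first failure. The `step` and `gentoo_update` names and the DRY_RUN switch are hypothetical; by default the script only prints the commands, so it is safe to read through before trying it as root on a real Gentoo box.

```shell
#!/bin/sh
# Sketch of the update routine, run step by step.
# DRY_RUN=1 (the default here) only prints the commands;
# set DRY_RUN=0 to actually run them.
DRY_RUN=${DRY_RUN:-1}

# Hypothetical helper: prints a command, and runs it unless in dry-run mode.
step() {
    echo ">> $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@" || { echo "step failed: $*" >&2; exit 1; }
    fi
}

gentoo_update() {
    step emerge --sync                                # refresh the portage tree
    step emerge --update --deep --newuse --ask world  # update everything
    step emerge --depclean                            # remove obsolete packages
    step revdep-rebuild                               # rebuild broken linkage
    step dispatch-conf                                # merge configuration files
}

gentoo_update
```

Keeping --ask on the world update means the script still pauses for confirmation before compiling anything, which seems a sensible default for a sketch like this.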

Web development team

It is World Cup frenzy time.

And watching all these matches really makes me think that a football team works just like a web development team.

If we strip off the context, then in the end we get exactly the same thing: a team playing their best for a simple objective. Put the ball in the goal.

For a start, a football team:

11 players, each with their own speciality (goalkeeper, striker, winger, defender), plus a coach, a trainer and a manager, all playing with one objective: win as many matches as possible.

A web development team works in the same way.

Many players (systems expert, developer), with the project manager as the coach, manager and trainer. All playing with one objective: put the website online.

These all have their similarities. We need to select the best specialist for the task at hand. We need to train them. We need the developers to run the sprints as fast and as well as possible. The coach is there to support the team and to make money (satisfied customers on one side, supporters on the other).

There are a huge number of similarities between these 2 fields, but at the same time this can be said for almost any other job (catering, building, police, hospitals). It is just simple management really.

Future of software is in the clouds

This article seemed to say it all!

http://googleenterprise.blogspot.com/2010/05/upgrade-here.html

Microsoft and Google have been thinking about this for a long time, and this article shows Microsoft’s strategy as well as Google’s.

The success of Microsoft in the 90s and noughties is largely based on their software.

You get yourself a computer and a few servers, and they will provide you with good software. If you cannot get yourself the hardware, then they cannot be blamed.

This has allowed very successful decades of good Microsoft software.

But now everyone has a computer and access to a fast internet connection, so the issue now is to not have to worry about those old servers any more.

Let them handle the hardware “in the clouds”; we don’t need it anymore. And that seems to be the way Microsoft and Google will be heading.

Google sites

I had the occasion yesterday to help someone out with their website, which they had created with Google Sites: http://sites.google.com

I was at first very impressed. Although I have known about this website for a long time, I hadn’t had the occasion to actually test it with a live site.

Google websites are as always very well done, well documented and easy to use. But at heart there really wasn’t much to the site. It was in fact just an advanced online editor. And by using an online editor, it would probably be pretty quick to build an alternative to this site.

It was also very limited in what it allowed users to do: there seemed to be no way of adding JavaScript, a different web statistics tool, or any type of dynamic content.

The simplicity of it was impressive, but this showed that there was most probably room for an alternative, more advanced tool once again.

In the same way, I could see the charts.google.com tool as a really interesting and fun basis for a more advanced statistics tool for developers.

The 3rd website which could also make progress is definitely Google Maps, since the API requires so much tweaking to get anything to work. But once again, I am sure that this team is already working on making progress on this.

7 wonders of the world

Maybe it is because I am approaching 30 and need to start thinking of achieving great things in my life that I am currently drawn to the great achievements of some people.

And I never really seemed to remember what the real 7 wonders were, so I thought I would quickly check my facts, and I was very surprised to discover on my first search that the Eurostar (the Channel Tunnel) was considered one of them.

That is of course because many different lists exist.

And this was the list of the American Society of Civil Engineers. The original is referred to nowadays as the 7 wonders of the ancient world. Other interesting lists include the 7 wonders of the Middle Ages or of the modern world.

Here are a few interesting lists anyway:

7 wonders of the ancient world:

One interesting one is definitely the Colossus of Rhodes, since I have also been very interested in the sea recently. And I like the idea of boats having to go under a statue to enter a port.

Here is the list of 7 wonders of the American Society of Civil Engineers:

Wonder | Date Started | Date Finished | Location
Channel Tunnel | December 1, 1987 | May 6, 1994 | Strait of Dover, between the United Kingdom and France
CN Tower | February 6, 1973 | June 26, 1976 (tallest freestanding structure in the world 1976–2007) | Toronto, Ontario, Canada
Empire State Building | January 22, 1930 | May 1, 1931 (tallest structure in the world 1931–1967; first building with 100+ stories) | New York, NY, U.S.
Golden Gate Bridge | January 5, 1933 | May 27, 1937 | Golden Gate Strait, north of San Francisco, California, U.S.
Itaipu Dam | January 1970 | May 5, 1984 | Paraná River, between Brazil and Paraguay
Delta Works/Zuiderzee Works | 1950 | May 10, 1997 | Netherlands
Panama Canal | January 1, 1880 | January 7, 1914 | Isthmus of Panama

Interesting to read about the CN Tower and the Panama Canal, although this list will most likely be the first to become outdated, since most of these buildings have already been overshadowed by others such as the incredible Burj Khalifa.

But the following list would seem to be the one most people would remember, since it includes the Great Wall of China, Machu Picchu, the Pyramids and the Colosseum.

Wonder | Date of construction | Location
Great Wall of China | 5th century BCE – 16th century CE | China
Petra | c.100 BCE | Jordan
Christ the Redeemer | Opened 12 October 1931 | Brazil
Machu Picchu | c.1450 CE | Peru
Chichen Itza | c.600 CE | Mexico
Colosseum | Completed 80 CE | Italy
Taj Mahal | Completed c.1648 CE | India
Great Pyramid of Giza (Honorary Candidate) | Completed c.2560 BCE | Egypt

I still find it fascinating how people’s ideas can turn into great projects, and therefore create great positive things.

URL shorteners

As I got up this morning, Easter Sunday 2010, I had a great idea.

I was going to build a URL shortener website, and add statistics tools to it.

This seemed to be a dead easy thing to do, and I am sure that, with my technical skill and knowledge, I could build a better one than all those which exist online.

What is the point of a URL shortener site:
1) shorten URLs to make them easier to “tweet” or text
2) log information about the users who visit your site: location, browser, page on which they clicked
3) many other things, as this link suggests:

http://www.friedbeef.com/top-5-url-shorteners-and-how-they-help-you/
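The shortening part itself could be sketched as follows, assuming each long URL is stored under a numeric id (for instance a database row number) and the short code is just that id written in base 62. The `to_base62` helper and the alphabet order are my own choices for illustration, not how any of the existing services necessarily work.

```shell
#!/bin/sh
# Hypothetical helper: turns a numeric id into a short base62 code.
ALPHABET=0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ

to_base62() {
    n=$1
    if [ "$n" -eq 0 ]; then
        echo 0
        return
    fi
    code=""
    while [ "$n" -gt 0 ]; do
        digit=$((n % 62))
        # cut -c is 1-based, so digit 0 is character 1 of the alphabet.
        char=$(printf '%s' "$ALPHABET" | cut -c "$((digit + 1))")
        code="$char$code"
        n=$((n / 62))
    done
    echo "$code"
}

to_base62 125
to_base62 1000000
```

This prints 21 and then 4c92. With codes of up to four characters, such a scheme covers 62^4 = 14,776,336 ids, which is why the resulting URLs stay so short.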

I will start by choosing a cool web name:

such as:
sho.rt, qui.ck or smi.le

But as I surfed the web, and looked at what was already out there, I discovered those which already existed:

These are the ones I found:

The leader:

http://bit.ly

http://j.mp

Google’s:

http://goo.gl

Other:

http://is.gd

http://sn.im

http://go.to/
http://start.at/
http://i.am/

But this one alerted me that it was maybe not such a good idea after all, since it can lead to too many spammers using the tool:

http://tr.im

I still think it could be fun to create something better, and the competition doesn’t seem that strong.
But it may just take a bit more time.
I also think this could be the perfect project to test out my shiny new dedicated server.

I will still keep the idea at the back of my mind, but maybe not for this weekend.

Anyway, another interesting aspect of these websites is the use of exotic country top-level domains (TLDs) to create interesting URLs such as:

.ly for Libya
.gl for Greenland
.im for the Isle of Man
.mp for the Northern Mariana Islands

http://en.wikipedia.org/wiki/List_of_Internet_top-level_domains

The last interesting point was the actual domain name hunting:

I was pretty surprised to see how many TLDs ovh.com now had on offer.

I still have to find out how to get the .ck of the Cook Islands for qui.ck.

I was sorry to find out that .le did not exist yet, since it could have been used for very interesting names.

I will definitely keep this in mind if I ever get to set up an interesting website one day.

Internet Junkie & homeless

I have never felt so close to being an internet junkie.

I have been in Paris for less than 24h. I have spent most of my time here trying to get a clean internet connection, and spent nearly 150€ trying to do just that.

I have just bought myself a new 20-minute internet fix for 2.5€.

I have a couple of things I need to do before starting my new job on Monday: read and learn my mission brief documents, and then prepare for the mission. Then fix a couple of small bugs, in order to finalise my previous job. This requires a quick internet connection: download a 1.5 GB archive, deploy it, set up a copy of the site in order to test the changes, fix 2 or 3 small things, recreate an archive and send it to the client or put it on the FTP space.

If I had been on a clean internet connection this could have taken me less than an hour, but in travelling conditions this can be an exhausting job. And I could have spent a few hours doing other useful things on other projects.

How did I manage to spend 150€?

- mobile phone number: 15€
- cybercafé to get a hotel room: 3€
- hotel room with wifi: 85€
- recharge mobile: 10€
- wifi option in hotel: 15€
- another card to get the wifi password: 3€

I have free internet with “neuf” and “free.fr”, since I have the logins and passwords, and there is free internet all over Paris: in parks, train stations, McDonald’s, Starbucks, etc. But the wifi option of my hotel is in fact a partnership with “orange”, which means that you need to buy small doses of 20 min or so and top up gradually. I thought I would just buy 15 euros’ worth yesterday, so that I wouldn’t have to think about it again and could happily get a lot of my business done.

But since I didn’t write down the login and password, I had the bad surprise this morning of realising that I needed an internet connection to get my password back. This meant that I had to go and hunt down the other free alternatives. But to my surprise the station and McDonald’s wifi did not work as easily as expected, and after over an hour of looking, with my frustration growing to a true addict’s level of “I need my internet now”, I bought myself another 20 min dose, just to get my password. This really felt like getting a short fix. And I think I got to understand the parallel between drug addiction and internet addiction a bit more.

It felt at this point that “orange” and its mini doses were truly aimed only at true internet addicts, and that only people with this truly addictive feeling would go and get a 20 min fix for 3€, or 1h for 5€ like I did this morning.

Giving internet access to all, like the other companies do, could be a solution to this internet addiction. But internet addiction is probably a much more complex thing than just helping those in need, since it will just grow. People who break their computer, or find themselves in remote areas with electricity shortages, most likely experience similar feelings of frustration.
And giving the dose to the people in need feels like only a short-term solution. If someone were a drug addict, the solution wouldn’t be to give them their drug; it is far more complex.
I believe this is exceptional in my case: I have an important issue I need to solve by tomorrow.
So I believe I can more or less cure myself. But I am sure not everyone has the same self-control.

Looking at all the homeless people in Paris this morning really made me think.
All these people probably had a similar point of frustration at some stage, at least when they lost their home or job. And at some point they lost control, and could not or would not ask for help, either because they couldn’t or out of pride. And the longer they stay in this situation, the more difficult it is for them to solve their problem.

They probably all need to learn how to manage their money, or whatever has cost them so much. When you lose control of things, you tend to just spend money in a ridiculous way, like I did this morning or this weekend, in a way that no one can really afford. And there will always be a company out there willing to help you, and to take everything it can away from you in a legal way, but for a ridiculous amount of money, just like “orange” has been doing with me today.

I can see now how information control could probably solve this. People tend to just not know how to count. They are overtaken by problems which they did not manage to predict. Everyone who has lost control has most likely fallen into this situation.

If we had a tool which could tell us what is going to happen next and which type of dangerous situation we are falling into (drugs, internet, television, gambling, food or alcohol addiction), and which could measure this type of addiction or danger, then everyone would have a grade of self-control or loss of control for every dangerous situation. This could most likely instantly solve all these problems.

Cerda the engineer

2010 is the year of Cerda in Barcelona.

And this is the second time this year that I have gone to see an exhibition about him. After the CCCB in January, this time it is at the Museu d’Història de Barcelona, next to the town hall.

I was already very impressed by the previous one, Cerda and the Eixample, which really showed how he created the famously square-looking town which everyone knows. It really helps you understand Barcelona: how it works, how people move in it, how it grows, how people interact. It had statistics about the best blocks to live in, the most dense, the most mixed, and comparisons with many other gridded towns. I came out of the previous exhibition very impressed by the analysis of the town, and by the ghost underground stations. It gave me a truly new vision of Barcelona.

But I was surprised to learn much more once again in this exhibition.

This exhibition was much less focused on today’s Barcelona, and really focused on Cerda himself, his time (the 1850s to 1890s) and the motives behind creating this town. It gave a true understanding, not of today’s Barcelona, but of how a few good men (engineer, mayor and the people in power) got to totally transform a town: go against the authority of the military and decide to cut open and reshape an entire town.

The original plans to create the “Road Princesse”, which cuts the Barri Gòtic in 2, were a true revelation for me.

How could a few good men change, plan, convince and engineer a plan which would knock out half the town? This is probably very similar to what Haussmann must have done in Paris.

But for me I think it meant much more. Maybe it is the time in my life, the fact that my brother is a town planner, and that I am an engineer with little power, doing projects with little impact.

But I do feel that there always need to be some people who get in there and convince others to make the great changes which are necessary. And if you have the right arguments and you know that you are right, then you need to convince.

I was very impressed by the atmosphere created in this exhibition, because it really showed the time in which the decisions were taken. Even some 150 years ago, before electricity, before trains, before cinema, tv and cars, before the first world war and bombs, and a huge number of other things which feel like part of today’s world and our own history but hadn’t happened yet for them, some people really knew what was necessary. Some politicians could still make impressive decisions and change and reshape not only the distant future in which we now are, but the very near future in which they lived, their present.

When you looked at that huge panoramic photo of the town in 1860, the maps and the old pictures animated in the film, you could really see the way the town must have looked and felt to live in, not today like in the previous exhibition but in those days, with the walls surrounding the town, the life expectancy stats, the air they must have been breathing.

Well done to them, and I hope I get to do something useful in my lifetime, like all the people who worked on this managed to do.

I will never look at the center of Barcelona in the same way again.