
With a desire to spread Linux among young aspirants, GLUG-T (GNU/Linux Users Group, Trichy) conducted a certification workshop on the basics of Linux commands and shell scripting. The four-day workshop began on the first day with the installation and partitioning of Linux distros on the participants' laptops. The workshop had short courses on basic shell commands and essential GUI applications, networking in Linux, basic shell scripting and Python. As an attempt to spread the need for open source and its sustenance, we encouraged them to work with Linux and spread its necessity to other students. This year's workshop has been a success!! 🙂 Cheat sheets on Linux commands were distributed to the participants.. stay tuned for Linux tee shirts.. 😉 Encourage a world where piracy is genuine..!! 😀

PS: Participants, find below the links to the PPTs used in the class.

Linux File Hierarchy | Shell Commands | UNIX Shell-Scripting Basics | Vim | ssh | sshvnc

The four-day class at the workshop

It was Steve Jobs' death that shook computer enthusiasts this year. As an ode to him, the CSG team of Pragyan '12 is blending some of Apple's designs into the Pragyan '12 website. I have designed a light and sleek front-end for this year's site. Now I shall take you through a few snapshots from the site design.

Starting with a lightweight Pragyan site banner design with a light grey (Apple's favorite) base color.

The menu bar of the site, designed in the iMac dock style.

Many of the button and drop-down styles were taken from iPhone and iPod touch button styles.

The countdown timer design for the start of Pragyan is imitated from the Apple iPod nano stopwatch design.

Snapshot of the overall site layout design

Hope you enjoyed the tour!! 🙂

It was very early in my fifth semester; Facebook and Microsoft were the first to show their faces on campus for internship and placement recruitment. They put my mind between pressure and excitement, looking at these unexpected opportunities extended only to Computer Science students of NIT Trichy. More easing to know was the GPA cut-off relaxation to 7.0 for both companies. "Phew! At least I now get a chance to sit in the interviews rather than being eliminated on GPA alone." My first eliminative round, for Facebook, was an online round with one ACM-style question and one hour allotted. The question was simple enough to solve in an hour; however, since I had focused very little on ACM-style questions until then, I struggled with string formatting for an automated input and output case. I know that's a very lame reason to miss out on FB in the eliminative round itself. So one thing I learnt about interviews: the basics of algorithmic knowledge are certainly necessary, even if you work in other fields of computing, for you to at least reach the personal interview. My second eliminative exam was the Microsoft written examination, comprising two separate papers for two separate divisions inside Microsoft: the India Development Center (IDC) and Microsoft Information Technology (MSIT). The IDC question paper had an algorithm optimization focus, including program testing and binary tree data structures. The MSIT paper comprised 10 MCQs (+4/−2) with deceiving choices and a written section with algorithms, data structures (mostly classes, linked lists and trees) and, of course, algorithm testing again. Hours of writing these exams left me tired, with a little disappointment at seeing everyone in groups discussing the answers I had got wrong. I learnt the same thing once more: algorithms and data structures are life savers for clearing the eliminative round in most IT companies.

So, as a partly expected and a little unexpected result, I cleared the Microsoft IDC written round and was called to the TNP (Training and Placement Cell) for my first IDC interview, to start within the next half an hour, informed through a text message. And yeah, this is the TNP style of informing you about interviews: giving you not more than 30 minutes to get out of your casual clothes and soak yourself in neat formal pants, a full-sleeve shirt (with an aching tie) and formal shoes. Since I was unaware of my chances of clearing the eliminative round, and of these sudden TNP information styles, I managed a clean trim only 15 minutes before the start of the interview round. Rushing from the Octagon (computer centre) towards the TNP cell with printouts of my CV, in my alien interview costume attracting the passers-by, I stepped into the TNP cell for the first time. Oh, everyone was still waiting for the interview to start! Attribute two of TNP to know: interview schedules turn out to be random variables in the end. But however random they are, you had better believe the myth that they are constants, so as not to risk your interview slot. The first round was a group fry: program solutions to be written, with questions on optimized string manipulation and a not-so-hard tree problem. Along with me, 7 more qualified for the one-on-one interview scheduled early the next morning. The same night saw me brushing up on data structures and algorithm concepts to equip myself for the interview, not yet aware of the one disappointment and one relief that were set to come. The next morning saw a 15-minute delay in reaching the TNP, counting on the random variable. Yeah, I then had to wait an hour for my second IDC interview. That was my first personal interview ever, and it turned me hell nervous. The personal interview had me shivering a bit before the interviewer. Questions were on general algorithms and software testing cases. Solutions with optimized algorithms were his expectation. With my least concern for code optimization until then, I did not stand a good chance of providing an optimized solution for the problem, along with the added disappointment that the interviewer did not go through my CV. I was eliminated in this round, which qualified 4 of my batch mates for the final interview (grr!! with the irony that all were 9-pointers). And now came the relief: I had also cleared the MSIT eliminative examination, and my first MSIT interview round was to be scheduled in a while ( a while (TNP language) <=> up to a few more hours (compiled machine-level language) ). So after a while of waiting, I was called in for the first MSIT personal round. The interviewer posed questions on tree data structures and class structures, which I was able to withstand to a pretty good extent. I expressed that my current interest was web technologies, and explained the class architecture of my favorite RayTracer C++ project (IIIT-B summer project). Positively, I qualified for the next round of my MSIT interview. After a little more of 'a while' of waiting, the next personal interview began. Since I had expressed my interests in the first round, my second round was on class structures and web technology experiences. The interviewer seemed cool, and he fulfilled the requirements to keep me in the hot seat. I was informed that I had cleared my second round as well. The final one was a telephonic HR interview scheduled in a lot more 'a while'-s (TNP language, of course). I was stuck with confusing thoughts and charted out points to be spoken in the HR round.
Glancing across those points, I was waiting for the phone to ring. And very shortly, the final interview round for the internship started. It began with casual questions on my technical favorites and interests. I explained each of the projects I had worked on and their architecture. Later it turned personal: to my family, to my blogging, to the websites I follow, to my academic interests, to my artistic interests, to my Facebook profile. "Why don't you use Microsoft products?" (Paused a few seconds) "I use them, sir! Photoshop and Flash are on Windows 7." "No! I meant the technologies. It's fine!" "Do you blog?" "Yes sir! It's dineshprasanth.wordpress.com." "What do you feel about Microsoft products?" "I feel the Microsoft developer tools are really good at equipping budding developers with the necessary aids. I have heard of this Kinect technology for gaming that Microsoft is working on; experimenting hands-on with new technology really impresses me." The interview ended quite soon.

It was already very late at night. I was asked to wait another hour outside the cell. Lots and lots of thoughts waved across me in that one hour. This could end in excitement or disappointment at the end of a day spent struggling through interviews and coming so close to selection. If I was in, at least the routine of waiting nervously for interviews at the TNP in this clown suit would come to an end. The production sec rep laughed at seeing me so nervous: "It's only an internship, man! Relax, read this newspaper till they call." Of course I was not patient enough then to read anything in the paper. I knew that I worked on open source technologies and that my blog articles were on open source technologies. Microsoft works the least on open source, and I was not sure if it really needed me in the company. At last only two people were left with the sec rep, waiting late at night in the cell. "You are Dinesh, right? Please come in." Sliding myself before a round table with a panel of seven to eight interviewers, everyone looking at me with some guilt, I faced weird questions. "You did pretty well in the personal interviews; what really went wrong in the HR round?" "Was the interviewer fine with whatever you spoke?" "Was there any misunderstanding?" I could feel nothing in my head to answer them. It was gone at last. "No sir! I answered well enough, and the interview did go mildly well." They could notice my face turning red. One at the table burst out laughing, looking at me: "That's all for now! You are in. Welcome to Microsoft."

My first open source conference happened to be foss.in, which took place in Bangalore between the 15th and 17th of December 2010. I can count this conference as one of the interesting things I involved myself in during my winter holidays. I was allowed a student delegate pass for the conference. I would shortly describe it as 'a perfect conference to get the feel of open source and interact with highly efficient open source contributors'.

FOSS Organization

FOSS – Free and Open Source Software – stands for the liberty of license that gives any developer the right to use, collaborate on and improve the code of software. The movement branches into foss.in in India. It was in the early 70s that the custom of paying for the usage of software and operating systems came about, and for well-known reasons the growth of software for commercial purposes was fruitful. Richard Stallman (RMS), a longtime member of the hacking community at the MIT AI laboratory, announced the GNU project out of his frustration with the change in culture of the computer industry. 😛 And there came the Free Software Foundation, which aimed at the development of free software and at extending the philosophy of the need for such software. Linus Torvalds released the Linux kernel as freely modifiable source code to the public under the GNU General Public License, benefitting other commercial organizations as well. A software revolution began in the 90s that made the hardware industries shift and make profits out of smaller investments. Developers of open source software experienced the spirit of coding, and the accreditation that open source gave them, rather than just money.

FOSS highly aims at building developers for open source projects and at spreading the philosophy of the need for open source among people. FOSS.IN attracts thousands of delegates from all over India, and it is a non-commercial event organized and run entirely by FOSS community volunteers. Every year it maintains the quality of the lectures in the conference, which are usually anchored by worthy developers of KDE, GNOME, the Linux kernel and other open source projects.

FOSS.IN 2010

Foss.in 2010 was again at the NIMHANS Convention Centre, Bangalore. Another attraction of the place is that it is very close to Forum Mall 😀 . The conference drew delegates from both software industries and colleges. I cannot compare it with last year's foss.in conference, for the reason that this was my first foss.in experience. The sponsors of 2010's foss.in were Nokia, HP and Collabora. It had stalls from Wikipedia, Gluster, Nokia and HP. Wikipedia, apart from joining the lectures to encourage contributors through Wikimedia, showcased the WikiReader, a small handy device for offline Wikipedia surfing that was recently released in the market. Nokia came up with their latest open source mobile operating system project, MeeGo version 1.0. Gluster came with their scalable storage server technology, a topic that was hard for me to understand much of, though 😛 . Hewlett-Packard showcased their TouchSmart laptops (reminded me of iPads!! :D) and experimented with shared desktop projections on four different screens. Linux's open source distro Fedora 14 was also part of the foss.in lectures. A few of the lectures that pulled me in were Improving the Quality of Video Calls, the Fedora Project, Security in Mobile Devices, MeeGo Development, Julian Assange and WikiLeaks, and the Technology of Wikipedia. Although 60% of the lecture topics were out of my reach, I felt foss.in was a perfect environment to be in.

Hats off Team

The hospitality of the conference surely deserves a mention. The conference hall was centrally air-conditioned and Wi-Fi facilitated (50 kbps download! 😀 Yup! I tried downloading stuff!! 🙂 ). A North Indian buffet was arranged in the afternoon. The guidance and care of the FOSS team are surely appreciated.

Loads of Goodies

Another great thing about conferences is the goodies, and foss.in in fact had a lot of them. Apart from Wikipedia, MeeGo and Gluster t-shirts, there were Wiki, Fedora and MeeGo OS badges and stickers. Delegates were complimented with a foss.in mug and calendars :D. More?? MeeGo SDKs were given on a 4 GB Transcend memory stick rather than on a compact disc. Now I have another thing to tell you.. "Hey friends!! You missed foss.in 2010!!!" 😛

Other entertainment mentions are the Raghu Dixit and Fahrenheit band performances. The performance of Aanjhan Ranganathan (known as 'tuxmaniac'), who immediately flew to India when he was roped in for the keynote 'A Hacker's Apology', is a sure mention. His speech on the need for open source, and the keynote of foss.in, gained a large applause. I recommend everyone have a look at it, even if you missed foss.in.

These conferences are surely a motivation for young open source enthusiasts, giving people a better platform and the best of motivation. I cannot wait for my next open source conference!! 😀 Thanks FOSS.IN!!!!

Links to know more about FOSS:

Foss.in official site

Foss community India

Learn about FOSS

Involve in Fossology

Involve in Linux Foundation

Google defined an algorithm to rank the web pages it indexes by crawling, more efficiently than any other search engine. Ranking hundreds of millions of pages is a challenging task. Lawrence Page and Sergey Brin defined the Google PageRank algorithm in their Stanford University paper.

Web search engines: needed a change

Search engine technology needed a drastic change between 1994 and 2000. By November 1997, search engines listed millions of web documents to be indexed for a search, up from just 110,000 in 1994. The number of search queries per day from users also increased tremendously, from roughly 1,500 per day to 20 million per day, and was estimated to reach hundreds of millions of queries by 2000. The WWW needed an efficient algorithm for search engines to function accurately. Google had invested properly in hardware performance and storage space, but was in need of an algorithm to use its resources to their maximum potential.

The Algorithm

The concept of anchoring is at the root of the definition of the algorithm. The linking between pages on the web is taken into account for prioritizing pages, so ranking is not restricted to categorizing text-based web documents alone.

PgRk(A) = ( 1-d ) + d(PgRk(K1)/(N1) + …. + PgRk(Kn)/(Nn))

The formula is not as complicated as it looks. Remember that a page is not ranked manually, nor by the total number of earlier visits by surfers. The above formula looks recursive. Yes, it is! So one can say by looking at the formula that the page rank of a page is also decided by the page ranks of some other pages. Those other pages are the pages that contain a link to page A.

( Notation: | A – a random web page | Ki – the ith page that links to A | d – damping factor | PgRk – PageRank | Ni – number of outbound links on page Ki | )

The next important thing is that the contribution of each page linking to page A towards improving A's rank certainly differs:

1. The page rank of A relates directly to the page ranks of the pages Ki.

2. The page rank contributed by each Ki is divided by the number of outbound links on that page.

We infer that when a page of higher page rank, say Facebook, has a link to your webpage, your webpage's rank improves considerably. When a low-recognition website links to your webpage, your page rank still increases, but not by much. What interferes again is the number of outbound links present on the linking site. When a site with a good page rank links to, say, 100 sites, its page rank contribution is split into 100 parts, making its contribution to each site much less effective. So just because a highly ranked webpage has a link to your website, it does not mean your page will also be ranked high.
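To make this concrete, here is a minimal sketch in Python (my own illustration, not code from the paper; the page values are made up, and d = 0.85 is the commonly cited damping factor):

def page_rank_of(inbound, d=0.85):
    # inbound is a list of (rank_of_Ki, outbound_links_of_Ki) pairs,
    # one pair for every page Ki that links to the page A being ranked.
    return (1 - d) + d * sum(rank / out for rank, out in inbound)

# A highly ranked page that scatters its rank over 100 outbound links
# can contribute less than a modest page with only 2 outbound links.
big_site = (8.0, 100)    # hypothetical rank 8, 100 outbound links -> contributes 8/100 = 0.08
small_site = (2.0, 2)    # hypothetical rank 2, 2 outbound links   -> contributes 2/2 = 1.0
print(page_rank_of([big_site]))    # ≈ 0.218
print(page_rank_of([small_site]))  # ≈ 1.0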

Justification for the algorithm

Lawrence Page and Sergey Brin gave a proper justification for the algorithm by considering the importance of a page from the point of view of a random surfer. 'Random surfer' here means that the surfer's clicking of links has no regard for the content of the pages, so in this case just forget about content. The probability that the surfer visits a page depends on the page rank of that page, and the probability that the surfer clicks a particular link on the page depends on the number of links present on the page. This is why the page rank of the parent page is divided by its total number of links when it contributes to a child page. The term in the formula is thus the sum of the probabilities of the random surfer reaching this page by following links.

Now comes the reason why we include a damping factor, further reducing the parent page's contribution. It is obvious that in a real-world situation a surfer does not keep clicking links continuously one after the other. When a surfer feels he has had enough surfing, or becomes interested in some other page at random, he stops clicking links. We introduce the constant d, the probability that a surfer continues clicking links (it lies between 0 and 1). This is why the constant d is multiplied with the term in the formula. We can also conclude that a page not only receives page rank contributions from the links of parent pages, but also from the probability that the surfer lands on the page at random without using links (remember the case we considered, where the surfer stops clicking links?). So every page has some minimum page rank even if no parent page links to it.

Next version of Page Rank Algorithm

In the other version of page rank algorithm submitted by Lawrence and Sergey ( different paper ), the algorithm is put as,

PgRk(A) = ((1-d)/N) + d(PgRk(K1)/(N1) + …. + PgRk(Kn)/(Nn))

N – the number of pages in the web

Except for the new notation N, all other variables have the same effect in the formula. N is included because the probability of the random surfer jumping to a random page, after he is bored of clicking links, is divided equally among all the pages in the web. This version of PageRank deals with the probability that the surfer lands on a page when he restarts his random walk. That is, in a web comprising 5 web pages, if a page has a rank of 3/5, it denotes that this page is reached 3 times out of every 5 times the surfer restarts the random page click.
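As a small sketch (again my own illustration, not from the paper), the only change from the first version is that the "random jump" term is divided by N:

def page_rank_v2(inbound, n_pages, d=0.85):
    # inbound: list of (rank_of_Ki, outbound_links_of_Ki) pairs; n_pages: total pages N.
    # Dividing (1 - d) by N makes the ranks behave like probabilities, so over
    # the whole web they sum to roughly 1 instead of roughly N.
    return (1 - d) / n_pages + d * sum(rank / out for rank, out in inbound)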

Model of Page Rank Algorithm

A model will explain the functioning of the algorithm clearly. Consider 3 pages, page A, page B and page C, where A links to both B and C, B links to C, and C links back to A. The damping factor is usually set to 0.85 according to Lawrence and Sergey, but for ease of calculation let us make it 0.5. This does not affect the fundamental principles of the algorithm, though it will surely affect the values when altered in real-world cases. We use the first version of the algorithm, as it is easier for calculation both in this case and in real-world situations.

Let x, y and z denote PgRk(A), PgRk(B) and PgRk(C):

x = 0.5 + 0.5*(z/1)

y = 0.5 + 0.5*(x/2)

z = 0.5 + 0.5*( x/2  +  y/1 )

These are 3 equations in 3 variables, and thus can be solved…

x = 1.07692308 y = 0.76923077 z = 1.15384615
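For the curious, here is a small sketch that solves the same system numerically (using NumPy; this is just my own illustration of the algebra above, with d = 0.5):

import numpy as np

# Coefficient matrix of the three equations above:
#    x            - 0.5*z = 0.5
# -0.25*x +  y            = 0.5
# -0.25*x - 0.5*y +     z = 0.5
A = np.array([[ 1.0,   0.0, -0.5],
              [-0.25,  1.0,  0.0],
              [-0.25, -0.5,  1.0]])
b = np.array([0.5, 0.5, 0.5])

x, y, z = np.linalg.solve(A, b)
print(x, y, z)   # ≈ 1.07692308 0.76923077 1.15384615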

As this is just a three-variable system, it can be solved easily. But in the real web situation there are billions of web pages, leading to a billion-variable problem to solve… This keeps our mouths wide open!! Are the Google servers that are set up enough to calculate the problem?? Even if they can calculate it, can they provide speed in web search?? Even if they give speed to one request, can they give the same speed to the thousands of requests coming to the servers every second??

Approximated computation of Page Rank

The answer to all the above questions is obviously no. The Google search engine uses an approximate, iterative computation to find the page rank. Each page is assigned an initial page rank that is used as a starting point for the iteration. Again considering the same example of three pages, where each page is initially assigned a page rank of 1:

Iteration    x             y             z
0            1             1             1
1            1             0.75          1.125
2            1.0625        0.765625      1.1484375
3            1.07421875    0.76855469    1.15283203
4            1.07641602    0.76910400    1.15365601
5            1.07682800    0.76920700    1.15381050
6            1.07690525    0.76922631    1.15383947

After six iterations the approximated result already agrees with the exact values of the original computation to about three decimal places. An approximation to 8 places can be achieved within about 12 iterations. According to Lawrence and Sergey, a good approximation of page rank can be obtained after about 100 iterations even in the real web situation. The page ranks of highly linked pages converge upwards from 1, while the page ranks of sparsely linked pages converge downwards from 1.
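Here is a minimal sketch of this iterative computation for the same three-page web (my own Python illustration; it updates each rank in place, which reproduces the table above):

d = 0.5
x = y = z = 1.0          # initial page ranks of A, B and C

for i in range(1, 13):
    x = (1 - d) + d * (z / 1)           # only C links to A; C has 1 outbound link
    y = (1 - d) + d * (x / 2)           # A links to B; A has 2 outbound links
    z = (1 - d) + d * (x / 2 + y / 1)   # A and B link to C
    print(i, round(x, 8), round(y, 8), round(z, 8))

# By around the 12th iteration the values agree with the exact solution
# (1.07692308, 0.76923077, 1.15384615) to 8 decimal places.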

Results of the Algorithm

The most important feedback on this algorithm is the improvement in the quality of search. The best and most common example was "bill clinton" as a search query. The earlier search engines returned pages containing "bill clinton sucks" and "bill clinton joke of the day" among their top-indexed results. But this should not be the standard result for such a query: the query only wanted general information about Bill Clinton. This was rectified very well by the new search engine, where priority is given in terms of the page rank of the page hosting the information. The top results of the search were then "whitehouse.gov" pages. Thus the quality of the search was achieved.

More mathematical conclusions and facts

  • Sum of all page ranks is still (at least approximately) equal to the number of webpages indexed; this means the average page rank is 1 (a quick numerical check follows this list).
  • Next, from the algorithm formula, the minimum page rank is obtained when

PgRk(K1)/(N1) + …. + PgRk(Kn)/(Nn) = 0

which happens when no page links to page A. But the page still possesses a minimum page rank of (1-d).

  • Thirdly, the maximum page rank can be obtained only when

PgRk(K1)/(N1) + …. + PgRk(Kn)/(Nn) = N

which happens only when each term in the summation equals 1. Hence this is theoretically possible only when all pages in the web link solely to one page, and that page must also link to itself (practically impossible).

  • The difference in PageRank value from PgRk 1 to PgRk 10 is not constant; researchers claim the scale of increase to be logarithmic. Hence it is possible that a page of PgRk 8, even with a lot of outbound links, can contribute more than a page of PgRk 4 with fewer outbound links. It is also true that it is harder for a page to improve to the next page rank than it was to reach its current rank from the previous one.
  • Also know that when a page contributes to ever so many child pages, increasing their page ranks, the page rank of the contributor does not get reduced. In short, it is 'sharing without losing'.
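A quick numerical check of the first point, using the three-page example from earlier (my own sketch; the ranks are the exact values computed above with d = 0.5):

x, y, z = 1.07692308, 0.76923077, 1.15384615   # exact ranks of A, B and C
print(x + y + z)        # ≈ 3.0 -> equal to the number of pages
print((x + y + z) / 3)  # ≈ 1.0 -> the average page rank is 1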

*surfer – denotes 'random surfer'

References

Contributors of the Algorithm

Lawrence Page
Sergey Brin