Taming Big Data with Apache Spark and Python PDF
Who invented Big Data?
Technically, big data has always existed. Any filing cabinet or registry is a kind of big data.
The first libraries of Babylon, in the 2000s BC, are an example of how people first faced the problem of storing and organizing a large amount of information.
Even 10,000-20,000 years ago, the ancestors of modern humans used bones to record stocks for trading and to analyze and predict food needs.
Gradually, information began to be used for forecasting. In 1662, the scientist John Graunt published the book Natural and Political Observations Made upon the Bills of Mortality. In it, he proposed that mortality data could be used to warn of the onset of a bubonic plague epidemic. Unexpectedly for him, the book also produced the first statistically valid estimate of the population of London. Find more info on that in our Big Data for Dummies PDF.
As the amount of information grew, processing and analyzing it became harder. During the 1880 census in the United States, officials found that tabulating the population data could take more than 8 years. The businessman and inventor Herman Hollerith came to the rescue. He created an electromechanical tabulating machine that worked with punched cards, which made it possible to process the required volume of information 32 times faster, in just three months. Hollerith's company later became part of what is now IBM.
Since then, the volume of information has kept growing, and the problems of data storage and processing speed have become more acute.
What is the value of big data?
Big data is valuable because it reveals non-obvious patterns. Knowing these patterns becomes your competitive advantage.
People are not inclined to analyze every step they take and may simply not think about what they do in everyday life. You may consider science fiction more interesting than detective stories, yet buy five detective novels in a row and no more than one about spaceships roaming the universe.
Think back to the apple pie story and read Taming Big Data with Apache Spark and Python PDF ebook to learn more.
Don't worry that big data means everyone in the world will know something specific about you. Big data scientists work only with aggregated, depersonalized data. They deal with anonymous groups rather than tracking the behavior of specific people. DLP (Data Leak Prevention) systems also help keep the data safe: they block its transfer outside.
Therefore, no one will be able to track information about a specific subscriber in any form, for example to find out what Vasily Pupkin was interested in. At the same time, it is easy to create an anonymous segment of people who are interested in buying real estate, who are planning a trip to an exotic country, or who move to a certain place at a certain time.
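To make the idea of an anonymous segment more concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not a description of how any particular operator or the ebook does it. The record layout, the field names (subscriber_id, interest), the sample values and the MIN_SEGMENT_SIZE threshold are all hypothetical. The same grouping could be written in Spark, but the standard library is enough to show the principle.

```python
from collections import Counter

# Hypothetical event records; field names and values are invented for illustration.
EVENTS = [
    {"subscriber_id": "a1", "interest": "real_estate"},
    {"subscriber_id": "a2", "interest": "real_estate"},
    {"subscriber_id": "a3", "interest": "real_estate"},
    {"subscriber_id": "a4", "interest": "exotic_travel"},
    {"subscriber_id": "a5", "interest": "exotic_travel"},
    {"subscriber_id": "a1", "interest": "exotic_travel"},
    {"subscriber_id": "a6", "interest": "sailing"},
]

# Assumed threshold: segments with fewer people than this are suppressed.
MIN_SEGMENT_SIZE = 2


def build_segments(events, min_size=MIN_SEGMENT_SIZE):
    """Return {interest: number of distinct subscribers}, without any identifiers."""
    # Deduplicate (subscriber, interest) pairs so each person counts once per segment.
    pairs = {(e["subscriber_id"], e["interest"]) for e in events}
    sizes = Counter(interest for _, interest in pairs)
    # Drop segments so small that a count could point to one individual.
    return {interest: n for interest, n in sizes.items() if n >= min_size}


segments = build_segments(EVENTS)
print(segments)
```

The output contains only segment names and group sizes. Subscriber identifiers never leave the function, and the one-person "sailing" group is dropped, since a segment of one would effectively describe a specific person.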