Handling Data in Modern Times
Data is information that can be analyzed to shape business strategies. As technology develops, more and more types of data are being collected. The sheer volume of data now available enables new strategies, but it also raises questions about privacy and the need for new ethical regulations.
Businesses today rely so heavily on data analysis that many can fairly be called data businesses. Any business that can derive meaningful patterns from a non-traditional data source can gain an edge over its competitors. Current technology makes data analysis through artificial intelligence possible. It also makes data available from previously unheard-of sources: a wearable device such as a smartwatch, for example, can collect biometric data and help identify diseases. It can then analyze that biometric data further to help manage a disease after it has been detected.
Such new approaches raise a rush of new concerns that need to be addressed before ethical boundaries are crossed. Responsible data handling requires weighing its possible negative consequences. A code of ethics for data sharing is therefore needed, so that data from unusual sources is also handled in a way that does not cause harm or curtail privacy.
Some novel examples of analysis from unusual sources come from policymaking and urban life. Governments often use data-driven simulations to gauge disaster preparedness. Recent data analysis has also helped policymakers anticipate the effects of regulations and other complex policies on the public.
Uses of Data Handling
Another use of data is to understand a problem better in order to find a solution. A recent study using the largest-ever global traffic data set found that traffic jams cost the US 305 billion USD each year. If the traffic problem can be solved, the result could potentially be a stronger economy.
Another example of brilliant urban data handling was revealed by the artificial intelligence-based census data analysis by Stanford researchers. They were able to accurately guess the race and estimate the voting patterns of neighborhoods through the analysis of census data from 35 cities.
In a more commerce-oriented analysis, machine learning was combined with spatial economics to better understand the location and concentration of certain kinds of businesses. Another brilliant use of data is the use of digital biomarkers in medicine. Digital biomarkers are the biological, physiological, and psychological data of people who use smartwatches or other connected devices. Pharmaceutical firms and medical professionals can use such data to help patients better.
Smart thermometers connect digitally to smartphone apps, and one such thermometer can track flu season in real time with better results. Such information can help the public handle seasonal illness better and take more effective precautions. Digital phenotyping is one such new field: it gleans information from digital devices and helps judge the health risks of users.
An app can read typing speed and map it to manic and depressive behavior, which can help people cope better with depressive tendencies. Facebook uses AI to scan the language of social posts and gauge suicide risk in users. With new technology and new sources of information comes the responsibility of using such information without crossing ethical boundaries.
The foundation of data analysis in statistics lies in the collection of data. Data is nothing but unorganized facts and figures collected for a certain purpose, such as an analysis. The medium through which data is collected is termed a source of data.
Sources of data are of two types; they are as follows –
Statistical Data
This type of data source refers to data collected for official purposes, such as a population census, official surveys, etc.
Non-Statistical Data
This type of data source refers to data collected for various administrative purposes, mainly in the private sector.
Different Sources of Data
Sources of data can also be classified based on their collection methods, which are –
Internal Sources of Data
In many cases, the data for an analysis is collected from records, archives, and various other sources within the organization itself. Such sources of data are termed internal sources of data.
Example: A school is performing an analysis to figure out the highest marks achieved in class 8 science subjects for the last 10 years.
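The school example above can be sketched in a few lines of Python. This is only a minimal illustration, assuming the school keeps its internal records as (year, student, marks) entries; the names, years, and marks below are hypothetical, and only three years are shown to keep the sketch short.

```python
# Hypothetical internal records of class 8 science marks (out of 100).
# Each entry is (year, student, marks); all values are made up for illustration.
records = [
    (2021, "Asha", 91),
    (2021, "Ravi", 84),
    (2022, "Meera", 88),
    (2022, "Kabir", 95),
    (2023, "Isha", 79),
    (2023, "Arjun", 90),
]


def highest_marks_by_year(rows):
    """Return a dict mapping each year to the highest marks recorded in it."""
    best = {}
    for year, _student, marks in rows:
        if year not in best or marks > best[year]:
            best[year] = marks
    return best


def overall_highest(rows):
    """Return the single highest marks across all years."""
    return max(marks for _year, _student, marks in rows)


if __name__ == "__main__":
    print(highest_marks_by_year(records))
    print(overall_highest(records))
```

Because every record comes from the school's own archives, this is an analysis of an internal source of data.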
External Sources of Data
Data may also be collected from various sources outside the organization for analytical purposes. Such sources of data collection are known as external sources of data.
Example: As a patient, you are analyzing the price charts of your nearby hospitals for the treatment of ulcers.
Check Your Progress
Q. What are different types of data sources?
Ans: Sources of data can be categorized on two bases, i.e. the purpose of data collection (statistical and non-statistical sources) and the type of data source (internal and external sources).
Types of Data
Data can be classified into two types –
Primary Data
Data that is collected first-hand by a surveyor, investigator, etc. is defined as Primary Data. The sources from which such data is collected are termed the primary sources of data collection for the concerned information.
Moreover, data is regarded as primary only if it has never undergone any prior statistical treatment. Such data is usually published, and more data is derived from the published source for other purposes. For example, a country’s population census is an instance of primary data collection.
Features of Primary Data
Primary Data has the following characteristics –
Such data is being collected for the first time.
Primary Data is original and thereby more reliable than other types of data.
This kind of data has not been used for any statistical analysis before.
Secondary Data
Data that has already been collected, analyzed, and published, and that has undergone statistical treatment, can be defined as Secondary Data. This type of data is derived from primary data sources.
However, this kind of data can also be collected by surveyors, investigators, etc. to conduct statistical analysis in order to derive newer information.
For example, the address you enter in a food delivery app is a common example of secondary data. Your address is not new information unless you have just purchased a property.
In that case, the address of your new property will be considered primary data. This example illustrates the difference between primary and secondary sources of data.
Features of Secondary Data
Secondary Data consists of the following features –
Secondary data is considered as ‘second-hand information’.
Secondary data is not original.
This kind of data has gone through statistical analysis at least once.
Secondary data is generally less reliable than primary data.
Another simple example of Secondary Data is information found on openly editable websites such as Wikipedia, where any user can edit the data provided on any page at any time.
Methods of Collecting Data in Statistics
Data collection is a key procedure carried out by most analysts during research. As an analyst, if you are unable to collect the necessary data for your research, your whole venture will lose its credibility.
So, data collection is an essential element of statistical analysis; it is a challenging duty that requires dedication, determination, proper planning, and the ability to see the assignment through.
The first step of data collection is figuring out what kind of data is required. The analysis then starts by collecting a sample from a certain part of the population through a specific sampling method.
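As an illustration of one such sampling method, the sketch below draws a simple random sample, in which every member of the population has the same chance of being picked. The text above does not name a particular method, so this choice is an assumption; the function name and the population of 100 respondent IDs are likewise hypothetical.

```python
import random


def simple_random_sample(population, k, seed=None):
    """Draw k distinct members from the population, each equally likely.

    A seed can be passed so that the same sample can be reproduced later.
    """
    rng = random.Random(seed)
    return rng.sample(population, k)


# Hypothetical population: respondent IDs 1 to 100.
population = list(range(1, 101))

# Pick 10 respondents to survey; the seed value is arbitrary.
sample = simple_random_sample(population, 10, seed=42)

if __name__ == "__main__":
    print(sample)
```

Other sampling methods, such as stratified or systematic sampling, would replace only the body of the function; the overall step of drawing a sample from the population stays the same.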
There are various methods of data collection that can be classified as per the type of data involved, which are –
Collection of Primary Data
Collection of Primary Data can be done through various methods, which are –
Direct Personal Investigation
In this method, surveyors or investigators collect the data themselves. This method is suitable for small projects where the required data needs to be reliable and the effort involved is manageable.
Collection with the Help of Investigators
In this method, a single correspondent or a group of correspondents collects the data for the surveyor. These correspondents are trained investigators who are employed for this purpose. This method of data collection is useful for a large population.
Collection Assisted by Questionnaires
When the amount of data that is required to be collected is significantly large, questionnaires are used to make the data collecting process easier. Questionnaires are nothing but a set of questions that, when answered, provide the required data. Surveyors can also mail questionnaires to the respondents for added convenience.
Collection of Secondary Data
The collection of secondary data is much easier than collecting primary data. Secondary data is available from various sources, both published and unpublished.
However, the investigator using this kind of data must check that the data is reliable, that it is suitable for the analysis, and whether any bias was involved in its sampling.
FAQs on Sources of Data
1. What are the different types of data sources based on the type?
Based on the type, sources of data are of two types, which are internal sources and external sources.
2. State the methods of collecting primary data.
The various methods of collecting primary data are direct investigation, telephonic investigation, investigation through questionnaires, and investigation through correspondents.
3. What are the features of secondary data?
Secondary data is considered second-hand information since it has already undergone statistical treatment. It is not original, it is generally less reliable than primary data, and its sampling may involve bias.
4. How is data useful?
Data can be used to make strategies, and it can be analyzed for various purposes, such as policy and decision making for the masses, or predicting disasters and health risks for individuals.
5. How is data connected to commerce?
All business strategies need data to back their rationales. Business runs on the analysis of consumer behavior, which is done using consumer data.
6. How is large data collected?
Basic and large-scale data is collected by governments through censuses and surveys. A more novel source of data is the internet, through which millions of people are connected and into which they continuously pour their information.
7. How is data analyzed?
Data analysis uses different statistical methods. Computers can be programmed to analyze data on a large scale, and more recently artificial intelligence has made ground-breaking analysis of data possible.
8. What are the ethical risks behind data handling?
Data handling involves ethical as well as legal risks because data is the property of the subject of the data and not of the party that holds it. Handling it requires the consent of the subject and strict guidelines about what the data may be used for. The principles of transparency, respect, and fairness in the use of data need to be strictly adhered to by everyone who handles data in any form.