
Components of Data Science Life Cycle





Data science continues to evolve as one of the most promising and in-demand careers of the 21st century.
Insights drawn from data are useful and profitable for businesses when the data is processed with intelligent algorithms that find patterns in it.

Data science follows a life cycle that defines what happens to the data at each stage, so that it ends up in a more informative and easier-to-use form. The life cycle consists of five stages, and each stage performs different tasks on the data.

                             
                          
                                           Fig: Components of Data Science Life Cycle

 
  The 5 components of the Data Science Life Cycle are:

  1. Data Capturing

Data is captured from different sources (through acquisition, entry, and extraction) so that results can be derived from it after pre-processing.

      The tasks performed during the Data Capturing stage are:

     -  Data Acquisition
     -  Data Entry
     -  Data Extraction

   2. Data Maintain

Maintaining the data is often required when we handle many varieties of data, and the dataset provided for analysis may be staged in different formats. This stage maintains the data and makes it available for processing and analysis. Pre-processing follows, which includes data cleaning, removal of NaN values or their replacement with the average of the whole column (if necessary), outlier removal, etc.

       The Data Maintain stage of the data science life cycle includes:
      
     - Data Cleansing
     - Data Staging
     - Data Warehousing
     - Data Processing
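
The cleaning steps mentioned above (replacing NaN values with the column average and removing outliers) can be sketched in a few lines of Python. This is only a minimal illustration using the standard library: the `ages` column, the helper names, and the 1.5 x IQR outlier rule are assumptions made for the example, not something this post prescribes.

```python
import math
import statistics


def drop_outliers_iqr(values, k=1.5):
    """Drop values outside [Q1 - k*IQR, Q3 + k*IQR]; NaN entries are kept."""
    observed = [v for v in values if not math.isnan(v)]
    q1, _median, q3 = statistics.quantiles(observed, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if math.isnan(v) or low <= v <= high]


def fill_nan_with_mean(values):
    """Replace NaN entries with the mean of the non-NaN entries."""
    observed = [v for v in values if not math.isnan(v)]
    column_mean = statistics.fmean(observed)
    return [column_mean if math.isnan(v) else v for v in values]


# Hypothetical column with one missing value and one obvious outlier (95.0).
ages = [23.0, 25.0, float("nan"), 24.0, 26.0, 22.0, 95.0]

without_outliers = drop_outliers_iqr(ages)
cleaned = fill_nan_with_mean(without_outliers)
print(cleaned)
```

In this sketch the outlier is removed before the mean is computed, so the extreme value does not distort the number used to fill the gap. Whether that order is right depends on the dataset.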


  3. Data Processing

Data may or may not be in a proper format (i.e. structured data), so we have to apply various techniques to process it until it is ready for analysis. Processing includes data modelling, data summarization (a complete summary of the structured data), data clustering, and classification into various groups.

     The data processing stage includes:

     - Data Modelling
     - Data Classification
     - Data Summarization
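
As a small illustration of classification into groups followed by summarization, the sketch below labels each record with a simple rule and then summarizes every group. The `orders` data, the `classify` thresholds, and the field names are hypothetical and chosen only for this example. Real projects often use a trained model or a clustering algorithm instead of fixed thresholds.

```python
from collections import defaultdict
from statistics import fmean

# Hypothetical structured records.
orders = [
    {"customer": "A", "amount": 20.0},
    {"customer": "B", "amount": 30.0},
    {"customer": "C", "amount": 100.0},
    {"customer": "D", "amount": 150.0},
    {"customer": "E", "amount": 400.0},
]


def classify(amount):
    """Assign an order to a group using fixed, illustrative thresholds."""
    if amount < 50:
        return "small"
    if amount < 200:
        return "medium"
    return "large"


def summarize(records):
    """Group the amounts by class and report count, mean, min and max."""
    groups = defaultdict(list)
    for record in records:
        groups[classify(record["amount"])].append(record["amount"])
    return {
        label: {
            "count": len(amounts),
            "mean": fmean(amounts),
            "min": min(amounts),
            "max": max(amounts),
        }
        for label, amounts in groups.items()
    }


summary = summarize(orders)
for label, stats in summary.items():
    print(label, stats)
```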

  4. Data Analyze

Analyzing the data and finding the key insights is one of the most challenging and decisive steps. The analyst performs various statistical tests and algorithms to derive patterns and insights from the data, and then tells the story of what the analysis found.

       The tasks performed during the Data Analyze stage of the data science life cycle include:

      - Exploratory Analysis
      - Predictive Analysis
      - Regression
      - Qualitative Analysis
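
To make the regression and predictive analysis tasks a little more concrete, here is a minimal sketch of a simple linear regression fitted with ordinary least squares and then used for one prediction. The `ad_spend` and `sales` numbers are made up for illustration, and `fit_line` is a hypothetical helper. In practice an analyst would usually rely on a statistics or machine-learning library.

```python
from statistics import fmean


def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    x_bar, y_bar = fmean(xs), fmean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Made-up data that follows y = 2x + 1 exactly.
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [3.0, 5.0, 7.0, 9.0, 11.0]

slope, intercept = fit_line(ad_spend, sales)

# Predictive step: estimate sales for a spend value not seen in the data.
predicted = slope * 6.0 + intercept
print(slope, intercept, predicted)
```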

  5. Data Communication

Data communication plays a key role in the data science life cycle. After analyzing the data, the main task is to represent and visualize the insights so that everyone understands what the data says (insights, patterns). Decisions are then made accordingly.

      The Data Communication stage includes:

    - Data Visualization
    - Data Reporting
    - Decision Making

So these are the components of the Data Science Life Cycle. Each stage requires particular skills and experience to perform the processes involved and to turn the data into a story. The components of the data science life cycle are further combined with the software development process, which helps data scientists and software engineers develop complete machine-learning-based applications powered by data science.


      
      
