
Components of Data Science Life Cycle





Data Science continues to evolve as one of the most promising and in-demand careers of the 21st century.
Insights drawn from data are useful and profitable for businesses when the data is processed with intelligent algorithms that find patterns in it.

Data Science work follows a life cycle that defines what happens to the data at each stage, so that it can be processed into a more informative and easier-to-use form. The Data Science life cycle consists of five stages. Each stage has its own tasks, which are performed on the data over the complete span of the life cycle.

                             
                          
                                           Fig:- Components of Data Science Life Cycle

 
  The five components of the Data Science Life Cycle are:

  1. Data Capturing

Data is captured from different sources so that, after pre-processing, some result can be derived from it. Capturing includes both data entry and data extraction.
     
      The tasks performed during the Data Capturing stage are:

     -  Data Acquisition
     -  Data Entry
     -  Data Extraction
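
As a rough illustration of data extraction, here is a minimal sketch that parses a small CSV export into Python records. The sample data, the column names (`city`, `temperature_c`) and the helper `extract_rows` are all hypothetical and exist only for this example. A real project would read from a file, an API, or a database instead of an in-memory string.

```python
import csv
import io

# Hypothetical raw export, made up for this example.
RAW_CSV = """city,temperature_c
Pune,31.5
Delhi,
Mumbai,29.0
"""

def extract_rows(text):
    """Parse CSV text into a list of dicts; empty readings become None."""
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        value = record["temperature_c"].strip()
        rows.append({
            "city": record["city"],
            "temperature_c": float(value) if value else None,
        })
    return rows

rows = extract_rows(RAW_CSV)
for row in rows:
    print(row)
```

Missing readings are kept as `None` rather than dropped, so the next stage can decide how to handle them.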

   2. Data Maintain

Maintaining the data is often necessary when we handle many varieties of data, and the datasets provided for analysis may be staged in different formats. At this stage the data is maintained and made available for processing and analysis. Pre-processing follows immediately and includes data cleaning, removal of NaN values or their replacement with the average value of the whole column (if necessary), outlier removal, etc.

       The Data Maintain stage of the Data Science life cycle includes:
      
     - Data Cleansing
     - Data Staging
     - Data Warehousing
     - Data Processing
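
The pre-processing steps mentioned above (replacing missing values with the column average and removing outliers) could be sketched as follows. This is only one possible approach: the sample column is made up, the function names are hypothetical, and the 1.5 × IQR rule used for outliers is a common convention assumed here, not something prescribed by the life cycle itself.

```python
from statistics import mean, quantiles

def fill_missing_with_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    column_mean = mean(observed)
    return [column_mean if v is None else v for v in values]

def remove_outliers_iqr(values, k=1.5):
    """Keep only values inside [Q1 - k*IQR, Q3 + k*IQR] (assumed rule)."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]

# Hypothetical column with one missing value and one extreme value.
column = [10.0, 12.0, None, 11.0, 13.0, 12.0, 11.0, 10.0, 95.0]

filled = fill_missing_with_mean(column)
cleaned = remove_outliers_iqr(filled)

print(filled)
print(cleaned)
```

Note that the mean is pulled upward by the extreme value, so the filled-in value here is higher than a typical reading. Depending on the data, it may be better to remove outliers first or to fill with the median.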


  3. Data Processing

Data may or may not be in a proper format (i.e. structured data), so we have to apply various techniques to process it until it is ready for analysis. Processing includes data modelling, data summarization (a complete summary of the structured data), and data clustering and classification into various groups.

     The data processing stage includes:

     - Data Modelling
     - Data Classification
     - Data Summarization
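
To make summarization and classification a little more concrete, here is a small sketch under simple assumptions. The records, the thresholds (`low=20`, `high=100`) and the helper names are hypothetical. Real classification would usually rely on a trained model or domain rules rather than fixed cut-offs.

```python
from statistics import mean

# Hypothetical cleaned records: (product, units_sold).
records = [("pen", 120), ("notebook", 45), ("stapler", 8), ("marker", 60)]

def summarize(values):
    """Return a few basic summary statistics for a list of numbers."""
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
    }

def classify(units, low=20, high=100):
    """Assign a sales band using assumed, illustrative thresholds."""
    if units < low:
        return "low"
    if units < high:
        return "medium"
    return "high"

summary = summarize([units for _, units in records])

# Group product names by their sales band.
groups = {}
for product, units in records:
    groups.setdefault(classify(units), []).append(product)

print(summary)
print(groups)
```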

  4. Data Analyze

Analyzing the data and finding the key insights is one of the most challenging and decisive steps. Analysts apply various statistical tests and algorithms to derive patterns and insights from the data, and then tell the story of what the analysis found.

       The tasks performed during the Data Analyze stage of the Data Science life cycle include:

      - Exploratory Analysis
      - Predictive Analysis
      - Regression
      - Qualitative Analysis
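
As one example of the regression task listed above, the sketch below fits a straight line with ordinary least squares and uses it for a simple prediction. The data points and the function name `fit_line` are made up for illustration. In practice an analyst would more likely use a statistics or machine-learning library.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical observations, e.g. months of experience vs. tasks completed.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]

slope, intercept = fit_line(xs, ys)
prediction = slope * 6 + intercept  # predicted y for an unseen x = 6

print(slope, intercept, prediction)
```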

  5. Data Communication

Data communication is of key importance in the Data Science life cycle. After analyzing the data, the main task is to represent and visualize the insights so that everyone understands what the data says (its insights and patterns). Decisions are then made accordingly.

      The Data Communication stage includes:

    - Data Visualization
    - Data Reporting
    - Decision Making

So these are the components of the Data Science Life Cycle. Each stage requires particular specialities and experience to perform the processes involved and to turn the data into a story. The components of the Data Science life cycle can further be combined with the software development process, helping data scientists and software engineers develop complete machine-learning-based applications powered by Data Science.


      
      
