|
Answer» Response: CRISP-DM stands for "Cross Industry Standard Process for Data Mining". It is a standard methodology for executing end-to-end data science projects or programs. It consists of the following phases, each involving different types of activities or tasks:
- Business understanding – typical tasks include determining the business objectives or goals to be accomplished, assessing the situation, determining the data mining goals (converting the business problem into a data problem), and defining a project plan with its tasks.
- Data understanding – typical tasks include collecting initial data, describing data, exploring data, and verifying data quality. This feeds the exploratory data analysis and acts as an interim step that shows the respective stakeholders which patterns and variations exist in the data.
- Data preparation – typical tasks include selecting the data needed for modelling, cleaning data, constructing data, integrating data, and formatting data according to the requirements and scope. Feature engineering is performed in this phase, and its output is the input to the next phase.
- Modelling or Model development – typical tasks include selecting modelling techniques, generating a test design, building the model, and assessing the model. In this phase, models are built using various algorithms or methods.
- Model evaluation – typical tasks include evaluating results, reviewing the process, and determining next steps. Various metrics are used to compare the models or experiments created in the previous phase.
- Deployment – typical tasks include planning deployment, planning monitoring and maintenance, presenting the final report, and reviewing the project. This is the operationalization phase: the model or solution that was evaluated as the best experiment is promoted to the production environment for usage and consumption.
These phases are iterative: the outcome of one phase can send the project back to an earlier one. The diagram below depicts a view of the process methodology. Image Ref
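
The phases above, and the way evaluation can send a project back to modelling, can be sketched in a few lines of plain Python. This is only an illustrative sketch under stated assumptions, not part of CRISP-DM itself: the toy data, the function names (`understand`, `prepare`, `fit_baseline`, `fit_linear`, `evaluate`, `run_crisp_dm`) and the `max_error` success criterion are all hypothetical.

```python
from statistics import mean

# Hypothetical toy data: (input, target) pairs; None marks a missing target.
RAW = [(1.0, 2.1), (2.0, 3.9), (3.0, None), (4.0, 8.2),
       (5.0, 9.8), (6.0, 12.1), (7.0, 13.9), (8.0, 16.2)]


def understand(rows):
    """Data understanding: describe the data and check its quality."""
    missing = sum(1 for _, y in rows if y is None)
    return {"rows": len(rows), "missing_target": missing}


def prepare(rows):
    """Data preparation: drop incomplete rows, then split train/test."""
    clean = [(x, y) for x, y in rows if y is not None]
    return clean[:-2], clean[-2:]  # the last two rows are held out


def fit_baseline(train):
    """Modelling, candidate 1: always predict the mean training target."""
    avg = mean(y for _, y in train)
    return lambda x: avg


def fit_linear(train):
    """Modelling, candidate 2: one-variable least-squares line."""
    mx = mean(x for x, _ in train)
    my = mean(y for _, y in train)
    slope = (sum((x - mx) * (y - my) for x, y in train)
             / sum((x - mx) ** 2 for x, _ in train))
    intercept = my - slope * mx
    return lambda x: slope * x + intercept


def evaluate(model, test):
    """Model evaluation: mean absolute error on the held-out rows."""
    return mean(abs(model(x) - y) for x, y in test)


def run_crisp_dm(rows, max_error):
    """Walk the phases once; max_error stands in for the success
    criterion agreed during business understanding (an assumption)."""
    report = understand(rows)
    train, test = prepare(rows)
    error = None
    # Iteration: if evaluation rejects a candidate, go back to modelling.
    for name, fit in [("baseline", fit_baseline), ("linear", fit_linear)]:
        model = fit(train)
        error = evaluate(model, test)
        if error <= max_error:
            # Deployment stand-in: report which experiment was promoted.
            return {"deployed": name, "error": error, **report}
    return {"deployed": None, "error": error, **report}


result = run_crisp_dm(RAW, max_error=1.0)
print(result["deployed"], result["missing_target"])
```

With this toy data and `max_error=1.0`, the baseline is rejected at evaluation and the linear model is promoted. A stricter threshold leaves `deployed` as `None`, which in a real project would mean revisiting an earlier phase rather than shipping anything.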
|