mlusecase(dvc) – (mlops)

In this project we’ll see how to use DVC (Data Version Control) in an MLOps workflow.

workflow

create a fresh conda env

        $ conda create -n dvc python=3.6 -y
        $ conda activate dvc 
        $ pip install -r requirements.txt 

basic commands (commonly used commands)

        # git commands 
        $ git init
        $ git remote add origin <your repo link>
        $ git branch -M main
        $ touch README.md
        $ touch .gitignore
        $ git add .
        $ git commit -m 'basic setup added'
        $ git push -u origin main 

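        # dvc commands
        $ pip install dvc
        $ dvc init       # initialise DVC inside the git repo
        $ dvc repro      # run the pipeline stages defined in dvc.yaml
        $ dvc dag        # show the pipeline stages as a graph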

dirs setup (you can skip it)

        # creating the packages
        $ mkdir src                      # source dir
        $ mkdir src/utils                # utility modules, inside src
        $ mkdir config                   # dir to store config files
        $ touch src/__init__.py          # for packaging
        $ touch src/utils/__init__.py
        $ touch params.yaml
        $ touch dvc.yaml
        $ touch config/config.yaml       # yaml file for configurations
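
The same layout can also be created from Python, which helps on systems without `mkdir`/`touch`. This is only a sketch: the function name `scaffold` and the `DIRS`/`FILES` lists are made up for illustration, and it assumes `utils` lives inside `src`, as the `src/utils/__init__.py` path implies.

```python
from pathlib import Path
from typing import List

# Directories and empty files matching the setup commands above
# (assumption: utils is a sub-package of src).
DIRS = ["src/utils", "config"]
FILES = [
    "src/__init__.py",
    "src/utils/__init__.py",
    "params.yaml",
    "dvc.yaml",
    "config/config.yaml",
]


def scaffold(root: Path) -> List[Path]:
    """Create the project skeleton under an existing root; return the file paths."""
    for d in DIRS:
        (root / d).mkdir(parents=True, exist_ok=True)
    created = []
    for f in FILES:
        path = root / f
        path.touch(exist_ok=True)  # existing files are left untouched
        created.append(path)
    return created


if __name__ == "__main__":
    for p in scaffold(Path(".")):
        print(p)
```

Running it twice is harmless, since existing directories and files are kept as they are.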

creating the local dir as a package

  • create the src dir as a package
  • write a setup.py file for it

                # setup.py

                from setuptools import setup

                # use the README as the long description of the package
                with open("README.md", "r", encoding="utf-8") as f:
                    long_description = f.read()

                setup(
                    name="src",
                    version="0.0.1",
                    author="shubhamchau222",
                    description="A small package for dvc ml pipeline demo",
                    long_description=long_description,
                    long_description_content_type="text/markdown",
                    url="https://github.com/shubhamchau222/dvc-MLusecase-mlops",
                    author_email="[email protected]",
                    packages=["src"],
                    python_requires=">=3.6",
                    install_requires=[
                        'dvc',
                        'pandas',
                        'scikit-learn'
                    ]
                )

    # to make src installable as a package,
    # list it in requirements.txt (this installs the local package)
    # open requirements.txt and add this line:
      -e .

    here '.' is the current directory (the one containing setup.py), and -e installs it in editable mode

        $ pip install -r requirements.txt   # (to install the local package)
  • until you install the local package you’ll get the error ‘No module named src’
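
A quick way to confirm the install worked is to ask Python whether it can find the package. This is a minimal sketch; the helper name `is_importable` is made up for illustration and only handles top-level names such as `src`.

```python
import importlib.util


def is_importable(name: str) -> bool:
    """Return True if a top-level module or package called `name` can be found."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    if is_importable("src"):
        print("src is importable")
    else:
        print("src not found; run: pip install -r requirements.txt")
```

Run it from a directory other than the project root. Otherwise Python finds the `src` folder next to the script and the check passes even without the install.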

GitHub

https://github.com/shubhamchau222/dvc-MLusecase-mlops