Development

Step 1: Setup the Docker environment

Docker is a convenient, lightweight way to standardize an environment: the program and all of its dependencies run inside a small Linux environment (often Alpine Linux) on the Docker Engine. In this case, the Docker image provided is preconfigured with Dis.co, discomp (Dis.co’s multi-process parallelization API), and other image-processing and numerical libraries such as OpenCV (the cv2 computer vision library), FFmpeg (video processing), and NumPy (scientific computing).

The purpose of this lesson is to verify that the Docker environment is set up correctly on your machine.

  1. Download the Dis.co Docker image with the following command.

    docker pull iqoqo/discofy:sdc.local
    

  2. Tag the Docker image with the following command.

    docker tag iqoqo/discofy:sdc.local discofy:sdc.local
    
  3. Check that the image is now listed in Docker.

    docker images
    

  4. In the project root directory (e.g., /Users/raymondlo84/Documents/sdc2019), execute the following command to run the test script in the Docker environment.

    docker run -it -v `pwd`:/home/codelab/ -w /home/codelab/lessons/lesson_1/ discofy:sdc.local pytest
    

    If the tests pass, Docker has successfully executed the pytest script.
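
    As an additional, optional sanity check, the snippet below confirms that the bundled Python libraries import cleanly inside the container. This is only an illustrative sketch; it assumes you start an interactive shell in the container (as in the later lessons) and that the modules are importable under these names.

    import cv2
    import numpy as np

    # Print the library versions to confirm the image is set up as expected.
    print("OpenCV:", cv2.__version__)
    print("NumPy:", np.__version__)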


Step 2: Dis.cofy an image

In Step 1, we set up a Docker environment that runs Python code with OpenCV, FFmpeg, and other scientific libraries such as NumPy. In this lesson, we will work through an example of overlaying eyeglasses onto a face detected in an image using these libraries.

Additionally, we have added the Dlib library to support machine-learning-based face detection and facial feature extraction (facial landmarks). In our code, a pre-trained facial landmark detector and model (i.e., shape_predictor_68.dat) are used to estimate the locations of 68 coordinates that map to facial structures on the detected face.

For further details on how the facial landmarks work, please refer to https://ibug.doc.ic.ac.uk/resources/facial-point-annotations/
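
For reference, here is a minimal, hypothetical sketch of the detector/predictor pattern with Dlib and OpenCV. It is not the lesson’s solution code; the image path and output file name are placeholders.

    import cv2
    import dlib

    # Load the face detector and the pre-trained 68-point landmark model.
    detector = dlib.get_frontal_face_detector()
    predictor = dlib.shape_predictor("shape_predictor_68.dat")

    image = cv2.imread("team.png")
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

    for face in detector(gray):
        landmarks = predictor(gray, face)
        # Points 36-47 cover the eyes, roughly where a glasses overlay is anchored.
        for i in range(36, 48):
            point = landmarks.part(i)
            cv2.circle(image, (point.x, point.y), 2, (0, 255, 0), -1)

    cv2.imwrite("landmarks_preview.jpg", image)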

  1. Open glasses_2.py in the lessons/lesson_2 directory.

  2. Add the solution code to discofy_image.

  3. To run the solution, execute the following command in the project root directory.

    docker run -it -v `pwd`:/home/codelab/ -w /home/codelab/lessons/lesson_2/ discofy:sdc.local python glasses_2.py <input image path> <output image path>
            
    

    Replace <input image path> and <output image path> with your own image paths. For this example, we have provided an image of the Dis.co team at lessons/lesson_2/team.png.

  4. Run the tester to generate and validate the results.

    docker run -it -v `pwd`:/home/codelab/ -w /home/codelab/lessons/lesson_2 discofy:sdc.local pytest
    
  5. Open the result image located at lessons/lesson_2/disco-team1.jpg.


Step 3: Dis.cofy videos

Now we have a powerful tool that can intelligently process an image and overlay our Dis.co shades onto a detected face using machine learning. The natural next step is to scale this to a video (i.e., a sequence of images), or even to hundreds of different video streams.

To handle video files, the FFmpeg library provides a simple, easy-to-use interface for extracting image frames from a video. Similarly, we can create a video from an image sequence with the same library.
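
For illustration, the short sketch below shows these two operations using the ffmpeg-python bindings; it assumes those bindings are available in the image, and the file names and frame rate are placeholders.

    import os
    import ffmpeg

    os.makedirs("frames", exist_ok=True)

    # Extract every frame of the clip as a numbered JPEG.
    ffmpeg.input("samsungfun.mp4").output("frames/frame_%05d.jpg").run()

    # Re-assemble a video from the image sequence (30 fps assumed here).
    ffmpeg.input("frames/frame_%05d.jpg", framerate=30).output("rebuilt.mp4").run()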

To get started, we will first look into the lessons/lesson_3 folder.

As you can see, we have provided glasses_3.py and a short video clip, samsungfun.mp4, for you to work with.

Now, let’s start by looking into the source code itself.

  1. Open lessons/lesson_3/glasses_3.py. This time, the implementation of the discofy_video function is left for you to complete. To simplify the work, we have already provided the OpenCV methods for extracting frames from a video.

  2. Replace the code from line 247 onwards with the following code.

    The code reads each frame from the input video, applies the mesh_overlays function discussed in Lesson 2, and writes the result out with FFmpeg. The reason we use two streams, in1 and in2, is to preserve the audio stream from the original video. A rough sketch of this flow is shown below.
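
    The following is only an illustrative sketch of that flow, not the official solution code: it assumes the ffmpeg-python bindings are available, that mesh_overlays takes and returns a single frame, and that the source clip has an audio track.

    import cv2
    import ffmpeg

    def discofy_video(input_path, output_path):
        cap = cv2.VideoCapture(input_path)
        fps = cap.get(cv2.CAP_PROP_FPS)
        width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
        height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))

        # Write the processed frames to a temporary, video-only file first.
        tmp_path = output_path + ".tmp.mp4"
        writer = cv2.VideoWriter(tmp_path, cv2.VideoWriter_fourcc(*"mp4v"), fps, (width, height))
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            writer.write(mesh_overlays(frame))  # overlay the glasses on each frame
        cap.release()
        writer.release()

        # Mux the processed video (in1) with the original audio (in2)
        # so the soundtrack of the source clip is preserved.
        in1 = ffmpeg.input(tmp_path)
        in2 = ffmpeg.input(input_path)
        ffmpeg.output(in1.video, in2.audio, output_path).run(overwrite_output=True)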

  3. To test the code, run it in Docker with the following command, substituting the desired video.

    docker run -it -v `pwd`:/home/codelab/ -w /home/codelab/lessons/lesson_3/ discofy:sdc.local python glasses_3.py <input video path> <output video path>
    

    Replace <input video path> and <output video path> with the appropriate file paths, such as lessons/lesson_3/samsungfun.mp4.


  4. Run the test code to generate and validate the results.

    docker run -it -v `pwd`:/home/codelab/ -w /home/codelab/lessons/lesson_3 discofy:sdc.local pytest
    
  5. Review the result from lessons/lesson_3/disco-samsungfun1.mp4.


Step 4: Dis.co CLI

In the last two lessons, we created pipelines to process images and videos with some state-of-the-art libraries and machine learning algorithms. However, as you will quickly notice, these processes take a long time to analyze video footage, especially with high-definition images and videos.

Here we introduce the Dis.co command-line interface (CLI). This is one of the easiest ways to get familiar with the Dis.co interfaces, which allow us to scale and offload the work to a dedicated cloud resource of our choice.

  1. Run the Docker image interactively. The terminal is now running inside the Docker container.

    docker run -it -v `pwd`:/home/codelab/ -w /home/codelab/lessons/lesson_4 discofy:sdc.local
    

  2. In the same terminal from the last step, run the following command. It returns the help menu of the Dis.co CLI, along with a bit of Dis.co ASCII art.

    disco -h
    

    Now, let’s explore some of the features of the CLI. hello_disco.py contains a simple print statement; in the next step, we will run this code on the Dis.co cloud servers.
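
    For reference, the file is essentially a one-liner along these lines (illustrative; the exact message may differ):

    # hello_disco.py
    print("Hello from Dis.co!")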

  3. Run hello_disco.py with the Dis.co CLI.

    disco add --name hello_job --script hello_disco.py --wait --run
    

    Note the job ID (e.g., [5d7f6f898d9278000a5064e2]); this unique ID is needed to check the status of the job or download its results.

  4. To download results from the execution, run the following command.

    disco view --job <your_job_id> --download
    

    Again, replace <your_job_id> with the job ID returned by the process. Lastly, we can examine the output by extracting the downloaded archive.

  5. Extract the zip file with this command.

    unzip <path to downloaded>
    

    In our case, the result can be found in the stdout file.

  6. Display the output result (for example, by printing the contents of the stdout file).

    That’s it. We have just completed our first cloud-based execution without changing a single line of code.

Using Dis.co to perform image processing on the cloud

Step 2 demonstrated a computer vision example of detecting faces and facial landmarks and overlaying virtual eyeglasses onto a person’s face. In this lesson, we will offload the same work to the Dis.co cloud and perform it remotely with only a few lines of code changed. More importantly, with Dis.co’s serverless architecture, you can scale this out 100 or 1,000 times without managing the resources or making any further code changes on the developer’s end.

In this demo, we have created a customized Docker image (with the OpenCV, FFmpeg, and Dlib libraries pre-installed) that is deployed by each agent running on the server. If desired, we can also set up GPU resources per server and scale your jobs even further.

The Python script glasses_4.py provides a full-blown example of how Dis.co handles input and output on the server side.

In particular, the Dis.co command takes the script glasses_4.py and the input image team.jpg, uploads them to the cloud server, and performs the processing remotely. During execution, any data written to the /local/run-result folder on the server side is saved in our final result.

Upon successful execution, Dis.co packages the results in a <job id>-<job name> folder, and we download the package back to the client side.
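
To make the server-side contract concrete, here is a minimal, hypothetical sketch of how a script like glasses_4.py could write its output. It is not the actual script; the only convention taken from the text above is that files written to /local/run-result end up in the downloadable result, and how the input image reaches the script is assumed.

    import os
    import cv2

    RESULT_DIR = "/local/run-result"

    def discofy_and_save(input_path="team.jpg"):
        os.makedirs(RESULT_DIR, exist_ok=True)
        image = cv2.imread(input_path)
        # ... apply the Lesson 2 face-detection / glasses-overlay pipeline here ...
        cv2.imwrite(os.path.join(RESULT_DIR, "disco-" + os.path.basename(input_path)), image)

    if __name__ == "__main__":
        discofy_and_save()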

  1. Run Docker interactively in the terminal. From the project root directory, run the following command.

    docker run -it -v `pwd`:/home/codelab/ -w /home/codelab/lessons/lesson_4.1 discofy:sdc.local
    
  2. Run the Dis.co command to upload and run the job on the Dis.co cloud.

    disco add --name disco_image --script glasses_4.py --input team.jpg --wait --run --download
    

  3. Extract the results (with either a GUI tool or the unzip command in the terminal).

    unzip <job id>-<job name>/results/<job id>.zip
    

    As usual, we will replace <job id> and <job name> accordingly.

    The final result should match exactly what we see in Step 2, and that’s it.


Step 5: Using Dis.co to perform video processing on the cloud

With the Dis.co platform, we can now send jobs across a virtually unlimited number of resources (devices) without the pain of managing them: compute on demand and scale as needed. In this part of the lesson, we will demonstrate how to parallelize the video processing pipeline by splitting the work into smaller clips that can be offloaded directly to Dis.co. For example, we can divide a ~4-minute video into 10-second clips (roughly 24 of them), process each clip in parallel, and merge the results in a later step. There is a trade-off in the granularity of the parallelization, and the right choice depends on the use case; developers can examine this further by running the solution at a small scale first.

Now, let’s explore how we can split videos with FFmpeg, and then how we can offload tasks to Dis.co and merge the results back.

  1. Run the Docker interactively.

    docker run -it -v `pwd`:/home/codelab/ -w /home/codelab/lessons/lesson_5 discofy:sdc.local
    
  2. In the terminal, run the FFmpeg command to split the video into 10-second chunks.

    ffmpeg -i samsung10.mp4 -c copy -map 0 -segment_time 00:00:10 -f segment split%03d.mp4
    

    This command generates a set of 10-second clips named split000.mp4, split001.mp4, and so on.


  3. Create Dis.co jobs on each video clip and process them remotely.

    disco add --name parallel_video --script glasses_5.py --wait --run --download --input "split*mp4"
    

    Each job ran for approximately 90 seconds on the Dis.co resources. Each input (i.e., each video clip) is automatically distributed across the resources. On the developer side, zero lines of code were changed to get these batches to run in parallel. Yes, it’s amazing.

    Upon completion of these tasks, the Dis.co CLI downloads the results back to the local machine because we have provided the --download flag in the last command.

  4. Run the commands to merge the results back with FFmpeg and some bash scripts.

    # unzip results
    cd *parallel_video/results &&
    for f in `ls *.zip`; do unzip -o $f; done
            
    # write the parts to a manifest
    rm -f manifest.txt
    for f in `ls *.mp4`; do echo "file '$f'" >> manifest.txt; done
            
    # stitch back the video
    ffmpeg -f concat -safe 0 -i manifest.txt -c copy discofy_toy.mp4
            
    # move back the result to the lesson dir 
    mv discofy_toy.mp4 ../..
            
    # go back home 
    cd ../..
    

    Now we have it: the final result can be found at lessons/lesson_5/discofy_toy.mp4


Bonus: discomp

Dis.co also supports the Python multiprocessing API natively. That means if you have prior experience working with Python multiprocessing (MP), you can easily port your code to Dis.co with a single line of code changed.

This bonus lesson demonstrates an analytics example of processing YouTube videos with the discomp API, which works natively with your Python code.

  1. Learn the code by reading you_tube_face_detection.py located in lessons/lesson_6.

    The function handle_url is the entry point for analyzing a video from a YouTube URL. The initial example is single-threaded and processes the videos one at a time.

    However, there are various ways to parallelize this in Python. In particular, we will utilize the Pool object from the multiprocessing Python module, as sketched below.

    Please refer to https://docs.python.org/3/library/multiprocessing.html for additional documentation.
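
    The following sketch illustrates the idea under the assumption that discomp mirrors the multiprocessing Pool interface (as the one-line-change claim above suggests); the import of handle_url and the placeholder URLs are hypothetical.

    # Local version:
    # from multiprocessing import Pool
    # Dis.co version (assumed drop-in replacement):
    from discomp import Pool

    from you_tube_face_detection import handle_url  # hypothetical import of the lesson's entry point

    urls = [
        "<YouTube URL 1>",
        "<YouTube URL 2>",
    ]

    if __name__ == "__main__":
        pool = Pool()
        results = pool.map(handle_url, urls)  # each URL is processed as a separate task
        print(results)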

  2. Run each MP version with Dis.co. There are four versions of the main function:

    • main_single_thread - a single-threaded main

    • main_mp - a multi-process version running locally on your computer

    • main_discomp - a multi-process version where each process handles a single video remotely on Dis.co cloud

    • main_discomp_mp - a multi-process version where each process handles multiple videos (in parallel) remotely on Dis.co cloud

    To run each version, we can simply comment and uncomment the relevant code in the main function.

    Show us your results and let us know how much performance gain you have squeezed out from Dis.co.

You're done!

Congratulations! You have successfully achieved the goal of this Code Lab activity. Now, you can run compute-intensive jobs and parallelize them using Dis.co by yourself! But, if you're having trouble, you may check out the link below.

Dis.co Complete Code (163.39 MB)