Development
Step 1: Set up the Docker environment
Docker is a convenient and easy way to standardize a very lightweight Linux environment (usually Alpine Linux running on the Docker Engine) that contains all the dependencies required to run a program. In this case, the Docker image provided is preconfigured with Dis.co, discomp (Dis.co’s multiprocessing parallelization API), and image processing and numerical libraries such as OpenCV (cv2, a computer vision library), FFmpeg (a video processing library), and NumPy (a scientific computing library).
The purpose of this lesson is to verify that the Docker environment is set up correctly on your machine.
- Download the Dis.co Docker image with the following command:
docker pull iqoqo/discofy:sdc.local
- Tag the Docker image:
docker tag iqoqo/discofy:sdc.local discofy:sdc.local
- Check that the image is now available in Docker:
docker images
- From the project root directory (e.g., /Users/raymondlo84/Documents/sdc2019), execute the following command to run the test script in the Docker environment:
docker run -it -v `pwd`:/home/codelab/ -w /home/codelab/lessons/lesson_1/ discofy:sdc.local pytest
The output above shows that Docker successfully executed the pytest script.
Step 2: Dis.cofy an image
In Step 1, we provided a Docker environment that runs Python code with OpenCV, FFmpeg, and other scientific libraries such as NumPy. In this lesson, we use these libraries in an interesting example: overlaying eyeglasses onto a face detected in an image.
Additionally, we have added the Dlib library to support machine-learning-based face detection and facial feature extraction (facial landmarks). In our code, a pre-trained facial landmark detector and model (i.e., shape_predictor_68.dat) estimate the locations of 68 coordinates that map to facial structures on the detected face.
For further details on how facial landmarks work, please refer to https://ibug.doc.ic.ac.uk/resources/facial-point-annotations/
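To make the idea concrete, here is a minimal, illustrative sketch of 68-point landmark detection with Dlib and OpenCV. This is not the lesson's code: the file names are assumptions, and the simple circle drawing stands in for what glasses_2.py actually does with the landmarks (feeding them into its overlay functions).

import cv2
import dlib

# Hypothetical paths; the lesson ships its own test image and model file.
IMAGE_PATH = "team.png"
MODEL_PATH = "shape_predictor_68.dat"

detector = dlib.get_frontal_face_detector()   # HOG-based face detector
predictor = dlib.shape_predictor(MODEL_PATH)  # pre-trained 68-point landmark model

image = cv2.imread(IMAGE_PATH)
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

for face in detector(gray):                   # one rectangle per detected face
    landmarks = predictor(gray, face)
    # Each of the 68 parts is an (x, y) point on the jawline, brows, eyes, nose, or mouth.
    for i in range(68):
        point = landmarks.part(i)
        cv2.circle(image, (point.x, point.y), 2, (0, 255, 0), -1)

cv2.imwrite("landmarks.jpg", image)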
- Open glasses_2.py in the lessons/lesson_2 directory. Hint: look at mesh_overlays and overlay_transparent; these are the main functions used for face detection and for meshing the overlay onto the image (an illustrative sketch of this kind of alpha overlay appears after these steps).
- Add the solution code to discofy_image.
- To run the solution, execute the following command from the project root directory:
docker run -it -v `pwd`:/home/codelab/ -w /home/codelab/lessons/lesson_2/ discofy:sdc.local python glasses_2.py <input image path> <output image path>
Replace <input image path> and <output image path> with your own image paths. In this example, we have provided an image of the Dis.co team at lessons/lesson_2/team.png.
- Run the tester to generate and validate the results:
docker run -it -v `pwd`:/home/codelab/ -w /home/codelab/lessons/lesson_2 discofy:sdc.local pytest
- Open the result image located at lessons/lesson_2/disco-team1.jpg.
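For intuition, here is one common way a transparent-overlay helper of this kind can be implemented with NumPy alpha blending. It is an illustrative sketch only; the actual signature and behavior of overlay_transparent in glasses_2.py may differ, so treat the lesson's code as the reference.

import cv2
import numpy as np

def overlay_transparent_sketch(background, overlay_rgba, x, y):
    # Alpha-blend a 4-channel (BGRA) overlay onto a 3-channel background at (x, y).
    # Assumes the overlay fits entirely inside the background.
    h, w = overlay_rgba.shape[:2]
    roi = background[y:y + h, x:x + w].astype(float)

    overlay_rgb = overlay_rgba[:, :, :3].astype(float)
    alpha = overlay_rgba[:, :, 3:4].astype(float) / 255.0  # per-pixel opacity in [0, 1]

    blended = alpha * overlay_rgb + (1.0 - alpha) * roi
    background[y:y + h, x:x + w] = blended.astype(np.uint8)
    return background

# Example usage (file names are hypothetical):
# face_img = cv2.imread("team.png")
# glasses = cv2.imread("glasses.png", cv2.IMREAD_UNCHANGED)  # keep the alpha channel
# result = overlay_transparent_sketch(face_img, glasses, x=100, y=80)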
Step 3: Dis.cofy videos
We now have a powerful tool that uses machine learning to detect a face in an image and intelligently overlay our Dis.co shades onto it. The natural next step is to scale this to a video (i.e., a sequence of images), or even to hundreds of different video streams.
To handle video files, the FFmpeg library provides a simple, easy-to-use interface for extracting image frames from a video. The same library can also assemble an image sequence back into a video.
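As a quick illustration of that round trip, here is a minimal sketch that reads frames from a video and writes them back out with OpenCV. The file names and codec are assumptions; the lesson's glasses_3.py ultimately drives FFmpeg for the final encode.

import cv2

cap = cv2.VideoCapture("samsungfun.mp4")  # input clip (name is illustrative)
fps = cap.get(cv2.CAP_PROP_FPS)
width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))

# Re-encode the frames into a new file; mp4v is a widely supported codec.
writer = cv2.VideoWriter("roundtrip.mp4", cv2.VideoWriter_fourcc(*"mp4v"), fps, (width, height))

while True:
    ok, frame = cap.read()   # one BGR frame per iteration
    if not ok:
        break                # end of stream
    # ...process the frame here (e.g., overlay the glasses)...
    writer.write(frame)

cap.release()
writer.release()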
To get started, look into the lessons/lesson_3 folder. It contains glasses_3.py and a short video clip called samsungfun.mp4 for you to test with.
Now, let’s look at the source code itself.
- Open lessons/lesson_3/glasses_3.py. This time, the implementation of the discofy_video function is left for you to complete. To simplify the work, we have already provided the OpenCV calls for extracting frames from a video.
- Replace the code from line 247 onwards with the solution code (an illustrative sketch appears after these steps). The solution reads each frame from the input video, applies the mesh_overlays function discussed in Lesson 2, and writes the result out with FFmpeg. Two input streams, in1 and in2, are used because we want to preserve the audio stream from the original video.
- To test the code, run it in Docker with the following command, using the desired video:
docker run -it -v `pwd`:/home/codelab/ -w /home/codelab/lessons/lesson_3/ discofy:sdc.local python glasses_3.py <input video path> <output video path>
Replace <input video path> and <output video path> with appropriate file paths, such as lessons/lesson_3/samsungfun.mp4.
- Run the test code to generate the test results:
docker run -it -v `pwd`:/home/codelab/ -w /home/codelab/lessons/lesson_3 discofy:sdc.local pytest
- Review the result at lessons/lesson_3/disco-samsungfun1.mp4.
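For reference, here is a rough sketch of the kind of loop discofy_video needs. It is not the official solution: it assumes the ffmpeg-python package and a mesh_overlays helper like the one in Lesson 2, so adapt it to the actual structure of glasses_3.py.

import cv2
import ffmpeg  # ffmpeg-python (assumed available in the Docker image)
import numpy as np

def discofy_video_sketch(input_path, output_path, mesh_overlays):
    cap = cv2.VideoCapture(input_path)
    fps = cap.get(cv2.CAP_PROP_FPS)
    width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))

    # in1: the original file (kept for its audio); in2: the processed frames, piped in raw.
    in1 = ffmpeg.input(input_path)
    in2 = ffmpeg.input("pipe:", format="rawvideo", pix_fmt="bgr24",
                       s="{}x{}".format(width, height), framerate=fps)
    writer = (
        ffmpeg
        .output(in2.video, in1.audio, output_path, vcodec="libx264", pix_fmt="yuv420p")
        .overwrite_output()
        .run_async(pipe_stdin=True)
    )

    while True:
        ok, frame = cap.read()
        if not ok:
            break
        frame = mesh_overlays(frame)  # Lesson 2 helper; its exact signature may differ
        writer.stdin.write(frame.astype(np.uint8).tobytes())

    cap.release()
    writer.stdin.close()
    writer.wait()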
Step 4: Dis.co CLI
In the last two lessons, we created pipelines to process images and videos with state-of-the-art libraries and machine learning algorithms. However, as you may have noticed, these processes take a long time to analyze video footage, especially with high-definition images and videos.
Here we introduce the Dis.co command-line interface (CLI). It is one of the easiest ways to get familiar with the Dis.co interfaces that let us scale and offload work to a dedicated cloud resource of your choice.
- Run Docker interactively. The terminal is now running inside the Docker container:
docker run -it -v `pwd`:/home/codelab/ -w /home/codelab/lessons/lesson_4 discofy:sdc.local
- In the same terminal from the last step, run the following command. It returns the Dis.co CLI help menu, with a bit of ASCII art from Dis.co:
disco -h
Now, let’s explore some of the CLI’s features. hello_disco.py contains a simple print statement; in the next step, we will run it on the Dis.co cloud servers.
- Run hello_disco.py with the Dis.co CLI:
disco add --name hello_job --script hello_disco.py --wait --run
Notice the job_id (e.g., [5d7f6f898d9278000a5064e2]); this unique ID is needed to check the job’s status or to download its results.
- To download the results of the execution, run the following command:
disco view --job <your_job_id> --download
Again, replace <your_job_id> with the job ID returned by the previous step. Lastly, we can examine the results by extracting the downloaded archive.
- Extract the zip file with this command:
unzip <path to downloaded>
In our case, the result can be found in the stdout file.
- Display the output result (e.g., print the extracted stdout file to the terminal).
That’s it. We have just completed our first cloud-based execution without changing a single line of code.
Using Dis.co to perform image processing on the cloud
Step 2 demonstrated a computer vision example: detecting faces and facial landmarks, and overlaying virtual eyeglasses onto a person’s face. In this lesson, we will offload that exact work to the Dis.co cloud and perform it remotely with only a few lines of code changed. More importantly, with Dis.co’s serverless architecture, you can scale this 100 or 1,000 times without managing resources or making any further code changes on the developer’s end.
In this demo, we have created a customized Docker image (with the OpenCV, FFmpeg, and Dlib libraries pre-installed) that is deployed by each agent running on the server. If desired, we can also set up GPU resources per server and scale your jobs even further.
The Python script glasses_4.py provides a full-blown example of how Dis.co handles input and output on the server side.
In particular, the Dis.co command takes the script glasses_4.py and the input image team.jpg, uploads them to the cloud server, and performs the processing remotely. During execution, any data written to the ‘/local/run-result’ folder on the server side is saved in the final result.
Upon successful execution, Dis.co packages the results in a <job id>-<job name> folder, and we download the package back to the client side.
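To illustrate the server-side contract described above, here is a minimal sketch of a Dis.co-friendly script. The input-path handling is an assumption (glasses_4.py defines the actual convention); the one detail taken from this lesson is that anything written under /local/run-result comes back in the downloaded results.

import os
import shutil
import sys

# Assumption: the uploaded input file path arrives as the first command-line argument.
input_path = sys.argv[1]

# Files written here are collected into the job's downloadable results.
result_dir = "/local/run-result"
os.makedirs(result_dir, exist_ok=True)

# Stand-in for the real work (face detection and glasses overlay in glasses_4.py):
# simply copy the input into the result folder.
shutil.copy(input_path, os.path.join(result_dir, "processed-" + os.path.basename(input_path)))
print("done")  # stdout is also captured and returned with the job results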
- Run Docker interactively in the terminal. From the project root directory, run the following command:
docker run -it -v `pwd`:/home/codelab/ -w /home/codelab/lessons/lesson_4.1 discofy:sdc.local
- Run the Dis.co command to upload and run the job on the Dis.co cloud:
disco add --name disco_image --script glasses_4.py --input team.jpg --wait --run --download
- Extract the results (either through a GUI tool or with the unzip command in the terminal):
unzip <job id>-<job name>/results/<job id>.zip
As usual, replace <job id> and <job name> accordingly.
The final result should match exactly what we saw in Step 2, and that’s it.
Step 5: Using Dis.co to perform video processing on the cloud
With the Dis.co platform, we can now send jobs across an unlimited number of resources (devices) without the pain of managing them: compute on demand and scale as needed. In this part of the lesson, we will demonstrate how to parallelize the video processing pipeline by splitting the work into smaller clips that can be offloaded directly to Dis.co. For example, we can divide a ~4-minute video into 10-second clips, process each clip in parallel, and merge the results in a later step. There is a trade-off in the granularity of the parallelization, and the right choice depends on the use case; developers can examine this by running the solution at a small scale first.
Now, let’s explore how we can split videos with FFmpeg, and then how we can offload tasks to Dis.co and merge the results back.
- Run Docker interactively:
docker run -it -v `pwd`:/home/codelab/ -w /home/codelab/lessons/lesson_5 discofy:sdc.local
- In the terminal, run the FFmpeg command to split the video into 10-second chunks:
ffmpeg -i samsung10.mp4 -c copy -map 0 -segment_time 00:00:10 -f segment split%03d.mp4
This command generates a set of 10-second clips named split000.mp4, split001.mp4, and so on.
- Create Dis.co jobs for each video clip and process them remotely:
disco add --name parallel_video --script glasses_5.py --wait --run --download --input "split*mp4"
Each job ran for approximately 90 seconds on the Dis.co resources. Each input (i.e., each video clip) is automatically distributed across the resources. On the developer side, not a single line of code was changed to get these batches to run in parallel. Yes, it’s amazing.
Upon completion of these tasks, the Dis.co CLI downloads the results back to the local machine because we provided the --download flag in the last command.
- Run the commands below to merge the results back with FFmpeg and some bash scripting:
# unzip results
cd *parallel_video/results && for f in `ls *.zip`; do unzip -o $f; done
# write the parts to a manifest
rm -f manifest.txt; for f in `ls *.mp4`; do echo "file '$f'" >> manifest.txt; done
# stitch back the video
ffmpeg -f concat -safe 0 -i manifest.txt -c copy discofy_toy.mp4
# move back the result to the lesson dir
mv discofy_toy.mp4 ../..
# go back home
cd ../..
There we have it: the final result can be found at lessons/lesson_5/disco-samsung.mp4.
Bonus: discomp
Dis.co also supports the Python multiprocessing API natively. That means that if you have prior experience with Python multiprocessing, you can port your code to Dis.co with a single line of code changed.
This bonus lesson will demonstrate an analytics example of processing YouTube videos with the discomp APIs that work natively with your Python code.
- Learn the code by reading you_tube_face_detection.py, located in lessons/lessons_6. The function handle_url is the entry point for analyzing a video from a YouTube URL. The initial example is single-threaded and processes the videos one at a time. However, there are various ways to parallelize this in Python; in particular, we will use the Pool object from the multiprocessing module (a minimal sketch follows these steps).
Please refer to https://docs.python.org/3/library/multiprocessing.html for additional documentation.
- Run each MP version with Dis.co. There are four versions of the main function:
- main_single_thread: a single-threaded main
- main_mp: a multi-process version running locally on your computer
- main_discomp: a multi-process version where each process handles a single video remotely on the Dis.co cloud
- main_discomp_mp: a multi-process version where each process handles multiple videos (in parallel) remotely on the Dis.co cloud
To run each version, we can simply comment and uncomment the relevant code in the main function.
Show us your results and let us know how much performance gain you have squeezed out of Dis.co.
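As background for the parallel versions, here is a minimal sketch of the multiprocessing.Pool pattern this lesson builds on. The URLs are placeholders and handle_url is a stand-in for the lesson's function; the one-line switch to discomp shown in the comment is an assumption, so check you_tube_face_detection.py for the exact import.

from multiprocessing import Pool
# Assumed one-line swap for the Dis.co version (verify against the lesson's code):
# from discomp import Pool

def handle_url(url):
    # Stand-in for the lesson's handle_url, which downloads and analyzes one YouTube video.
    return "processed " + url

if __name__ == "__main__":
    urls = [
        "https://www.youtube.com/watch?v=VIDEO_ID_1",  # placeholder URLs
        "https://www.youtube.com/watch?v=VIDEO_ID_2",
    ]
    with Pool(processes=2) as pool:
        results = pool.map(handle_url, urls)  # one worker process per URL
    print(results)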
You're done!
Congratulations! You have successfully achieved the goal of this Code Lab activity. Now you can run compute-intensive jobs and parallelize them with Dis.co by yourself! If you're having trouble, check out the link below.