Prepare two development PCs and make sure that each one can log in to the other over SSH without typing a password.
Make sure that the directory layout and the contents of the code to be run are identical on both machines.
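One way to confirm that the code on both machines really is identical is to compute a single checksum over the code directory on each PC and compare the two values. The sketch below is only an illustration, not part of the original workflow: the function name directory_digest is hypothetical, and it covers file paths and file contents only (empty directories and file permissions are ignored).

```python
import hashlib
import os


def directory_digest(root: str) -> str:
    """Return one SHA-256 hex digest over relative paths and file contents under root."""
    digest = hashlib.sha256()
    for dirpath, dirnames, filenames in os.walk(root):
        # Sort in place so the traversal order is the same on every machine.
        dirnames.sort()
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            # Hash the path relative to root, with "/" separators on all platforms.
            rel = os.path.relpath(path, root).replace(os.sep, "/")
            digest.update(rel.encode("utf-8") + b"\0")
            # Hash the file contents in 1 MiB chunks to keep memory use low.
            with open(path, "rb") as fh:
                for chunk in iter(lambda: fh.read(1 << 20), b""):
                    digest.update(chunk)
            digest.update(b"\0")
    return digest.hexdigest()
```

Run directory_digest on the code directory of each PC and compare the printed strings. If they differ, the directories are not identical.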
Start the Docker container on the first PC.
Here, 096e26686644 is the ID of the container started in the first step; you can look it up with docker ps.
Start the Docker container on the second PC.
Note that the host port mapped to SSH is different here: the first PC uses 10022 and the second PC uses 10023. Make sure the two PCs use different port numbers.
Make sure the code directories inside the containers on both PCs are identical. You can mount the same development directory using -v.
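The container start-up described above can be sketched as a small helper that assembles a docker run command per PC. This is an assumption-laden illustration, not the original command: the image name and code directory are hypothetical placeholders, only the standard docker run options -it, -p and -v are used, and any GPU-related options your setup needs are left out.

```python
import shlex
from typing import List

# Hypothetical placeholders; substitute your own image and code directory.
IMAGE = "my-training-image:latest"
CODE_DIR = "/data/project"

# SSH host ports from the text: each PC must use a different one.
PC_SSH_PORTS = {"PC1": 10022, "PC2": 10023}


def docker_run_command(image: str, ssh_port: int, code_dir: str) -> List[str]:
    """Build a docker run argument list that maps SSH and mounts the code directory."""
    return [
        "docker", "run", "-it",
        "-p", f"{ssh_port}:22",          # host port -> container SSH port 22
        "-v", f"{code_dir}:{code_dir}",  # same path inside and outside the container
        image,
    ]


# Fail early if two PCs were accidentally given the same SSH port.
assert len(set(PC_SSH_PORTS.values())) == len(PC_SSH_PORTS)

for pc, port in PC_SSH_PORTS.items():
    print(pc, shlex.join(docker_run_command(IMAGE, port, CODE_DIR)))
```

Mounting the directory at the same path on both PCs keeps the code location inside each container identical, which is what the later steps rely on.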
In addition, you can test whether the two Docker containers can log in to each other without a password. Use the ifconfig command to get the IP address of the container on one machine (PC1).
Suppose PC1's container IP is 172.17.0.12. In the container on the other machine (PC2), run ssh -p 22 172.17.0.12 to check whether you can log in to PC1.
If the login succeeds, you can proceed to the next step.
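Before trying the SSH login, it can help to check that the SSH port on the other container is reachable at all. The following is a minimal sketch using only the Python standard library; the function name port_is_open is hypothetical. It only tests that a TCP connection can be opened, so it does not prove that passwordless login works.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For the example address used above, you would call port_is_open("172.17.0.12", 22) from the container on PC2. If it returns False, check the network setup before debugging SSH keys.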
hostip: the IP address of the container on PC1; use ifconfig to check it.
--nnodes 2: 2 is the total number of PCs.
--nproc_per_node 4: 4 is the number of GPUs on each PC (you may also need to manually set this number to 4 in configs/classification/mobilenetv1_imagenet.py).
Run this command to verify that the multi-machine, multi-GPU job runs properly.
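To show how the parameters fit together, the sketch below assembles one launch command per PC. It assumes a launcher in the style of PyTorch's torch.distributed.launch: --nnodes and --nproc_per_node come from the text above, while --node_rank, --master_addr, --master_port, the script name train.py and its --config option are assumptions made for illustration and may differ from the actual command.

```python
import shlex
from typing import List


def world_size(nnodes: int, nproc_per_node: int) -> int:
    """Total number of training processes: one per GPU across all PCs."""
    return nnodes * nproc_per_node


def launch_command(node_rank: int, nnodes: int, nproc_per_node: int,
                   master_addr: str, script_args: List[str],
                   master_port: int = 29500) -> List[str]:
    """Build the launcher argument list for one PC (assumed torch.distributed.launch style)."""
    return [
        "python", "-m", "torch.distributed.launch",
        "--nnodes", str(nnodes),
        "--node_rank", str(node_rank),        # 0 on PC1, 1 on PC2
        "--nproc_per_node", str(nproc_per_node),
        "--master_addr", master_addr,         # hostip: container IP on PC1
        "--master_port", str(master_port),
        *script_args,
    ]


# Hypothetical training script and option; only the config path comes from the text.
SCRIPT = ["train.py", "--config", "configs/classification/mobilenetv1_imagenet.py"]

for rank in range(2):
    print(shlex.join(launch_command(rank, 2, 4, "172.17.0.12", SCRIPT)))
print("total processes:", world_size(2, 4))
```

With 2 PCs and 4 GPUs each, the job runs 8 processes in total. Both PCs use the same master address, and only the node rank differs between them.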