Classifying an Image with PyTorch

An essential first task with any deep learning framework is classifying an image with a pre-trained model. This article walks through the steps in PyTorch.

1. Head over to pytorch.org for instructions on how to install PyTorch on your machine.
2. Install the other dependencies, including a specific commit of torchvision (since things are changing quickly).

3. Import packages and hardcode URLs.

The first two imports are for reading labels and an image from the internet. The Image class comes from a package called Pillow and is the format for passing images into torchvision. LABELS_URL points to a JSON file that maps label indices to English descriptions of the ImageNet classes, and IMG_URL can be any image you like. If the image shows one of the 1,000 ImageNet classes, this code should classify it correctly.

4. Initialize the model.

This will download the weights for the SqueezeNet model.

5. Define the preprocessing transform.

The specific steps in the preprocessing transform come from the PyTorch examples repo. Without them, the classifier will not work correctly.

6. Download the image and create a pillow Image.

This is a quick trick for reading images from a URL. You can also read them from disk with Image.open("/path/to/image.jpg"). One nice thing about Pillow images is that if you execute a code cell with the object in Jupyter, it will display the image for you.
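The trick amounts to wrapping the downloaded bytes in an in-memory file object before handing them to Pillow. A small sketch, with hypothetical helper names image_from_bytes and fetch_image:

```python
import io

import requests
from PIL import Image


def image_from_bytes(data):
    """Decode raw image bytes into a Pillow Image without touching disk."""
    return Image.open(io.BytesIO(data))


def fetch_image(url, timeout=10):
    """Download an image over HTTP and return it as a Pillow Image."""
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()  # fail loudly on 4xx/5xx responses
    return image_from_bytes(response.content)
```

With these in place, img_pil = fetch_image(IMG_URL) would produce the image used in the next step. For grayscale or transparent images, calling .convert("RGB") on the result keeps the three-channel normalization happy.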

7. Preprocess the image.

First, we apply the preprocessing transforms from above; then we use .unsqueeze_(0) to add a dimension for the batch. In PyTorch, any tensor method whose name ends with an underscore operates in place.
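To make the in-place behavior concrete, here is a small sketch that uses a zero tensor as a stand-in for the preprocessed image:

```python
import torch

# Stand-in for preprocess(img_pil): channels x height x width.
img_tensor = torch.zeros(3, 224, 224)
shape_before = tuple(img_tensor.shape)

# In place: img_tensor itself gains a leading batch dimension.
returned = img_tensor.unsqueeze_(0)
shape_after = tuple(img_tensor.shape)

# Out of place: the original tensor keeps its shape, a new view is returned.
other = torch.zeros(3, 224, 224)
batched = other.unsqueeze(0)
```

The in-place call returns the same tensor object, so there is no need to reassign the result.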

8. Run a forward pass with the neural network.

The input to the network needs to be an autograd Variable. We run the forward pass by calling the squeeze model. NOTE: this does not apply the softmax activation function.
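A sketch of the forward pass follows, written for PyTorch 0.4 and later, where Variable was merged into Tensor and a plain tensor can be passed to the model directly. The helper name forward_pass is hypothetical. It also shows how softmax could be applied afterwards if probabilities are wanted.

```python
import torch
import torch.nn.functional as F


def forward_pass(model, batch):
    """Return (raw scores, softmax probabilities) for a batch of inputs."""
    model.eval()               # inference mode: dropout off
    with torch.no_grad():      # no gradients needed for classification
        fc_out = model(batch)  # raw, unnormalized class scores
    probs = F.softmax(fc_out, dim=1)  # each row now sums to 1
    return fc_out, probs
```

Because softmax is monotonic, the index of the largest raw score is also the index of the largest probability, so the label lookup works on either output.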

9. Download the labels.

The requests package will parse JSON for us and return a dictionary. But it’s nice for the keys to be integers since we’re looking for the index of the maximum element in fc_out. After this step, labels will look like this:
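The key conversion can be sketched as a dictionary comprehension. The helper names are hypothetical, and the two sample entries are illustrative ImageNet class names rather than output copied from the original labels file.

```python
import requests


def parse_labels(raw):
    """Convert JSON's string keys ("0", "1", ...) to integer indices."""
    return {int(key): value for key, value in raw.items()}


def fetch_labels(url, timeout=10):
    """Download the labels JSON and return an index -> description mapping."""
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()
    return parse_labels(response.json())


# Two illustrative entries in the shape the JSON parser would return.
sample = parse_labels({"0": "tench, Tinca tinca", "1": "goldfish, Carassius auratus"})
```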

10. Print the label!

Notice that the fc_out variable has a .data attribute. This is a torch Tensor, whose .numpy() method gives us a NumPy array. We can call .argmax() on that array to get the index of the maximum element. Looking up that index in labels gives us our class label.
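Here is a small sketch of the lookup, using toy scores and labels in place of the real network output. It uses .detach() rather than .data, which is the usual choice in newer PyTorch versions; top_label is a hypothetical helper name.

```python
import torch


def top_label(fc_out, labels):
    """Return the label for the highest score in a batch of one."""
    # argmax() on the flattened array equals the class index when the batch size is 1.
    index = int(fc_out.detach().numpy().argmax())
    return labels[index]


# Toy stand-ins for the network output and the labels dictionary.
scores = torch.tensor([[0.1, 2.0, -1.0]])
toy_labels = {0: "cat", 1: "dog", 2: "fish"}
prediction = top_label(scores, toy_labels)
```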

Complete code:
