I’ve decided to call my segmentation network HooVorNet, and I’ll share some of its key unique features. HooVorNet employs feature generator blocks that split filters among several parallel calculation paths and then recombine them with concatenation. HooVorNet’s feature generator blocks are distinct from other split-and-merge designs such as Google’s Inception module because HooVorNet uses shallow and deep auto-encoders in the split paths and incorporates a skip connection by concatenation.
Using the HooVorNet Feature Generator block, the network simultaneously gains the benefits of ordinary convolutional feature extraction and the incredibly deep inference of chained auto-encoders. The block encodes shallow swept local features (1×1, 3×3, and 5×5) and deep strided features (16×16).
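To make the split-and-concatenate idea concrete, here is a minimal Keras sketch of such a block. The exact filter splits, depths, and layer choices of the real HooVorNet blocks are not published here, so everything below (the function name, filter counts, and the two-level hourglass) is an illustrative assumption, not the actual architecture:

```python
import tensorflow as tf
from tensorflow.keras import layers

def feature_generator_block(x, filters=8):
    # Shallow swept paths: 1x1, 3x3, and 5x5 convolutions over the same input.
    p1 = layers.Conv2D(filters, 1, padding="same", activation="relu")(x)
    p3 = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    p5 = layers.Conv2D(filters, 5, padding="same", activation="relu")(x)
    # Deep strided auto-encoder path (hypothetical small hourglass):
    # downsample twice, then upsample back to the input resolution.
    e = layers.Conv2D(filters, 3, strides=2, padding="same", activation="relu")(x)
    e = layers.Conv2D(filters, 3, strides=2, padding="same", activation="relu")(e)
    d = layers.Conv2DTranspose(filters, 3, strides=2, padding="same", activation="relu")(e)
    d = layers.Conv2DTranspose(filters, 3, strides=2, padding="same", activation="relu")(d)
    # Skip connection by concatenation: re-inject the block input alongside
    # the shallow and deep paths.
    return layers.Concatenate()([p1, p3, p5, d, x])

inp = layers.Input(shape=(32, 32, 3))
model = tf.keras.Model(inp, feature_generator_block(inp))
```

All paths keep the spatial resolution of the input, so the concatenation simply stacks their channels (here 8 + 8 + 8 + 8 + 3 = 35).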
The HooVorNet skip connections are key to fast training and help prevent overtraining. Normally, when auto-encoders (sometimes called hourglass networks) are embedded in a network, there is a strong tendency toward overtraining: the embedded auto-encoder learns to assign each training image a unique identifier, and the rest of the model builds the desired output from that. This leads to highly accurate predictions on images in the training set but very poor predictions on images outside it. So HooVorNet incorporates skip connections that re-inject the original data (ideally modified by random dropout or noise during training), making it suboptimal for training to drive the hourglass networks toward merely identifying each training image. Instead, the embedded networks become what is desired: strong feature generators.
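A noisy skip path of this kind can be sketched in a few lines of Keras. The noise and dropout rates below are assumptions (the post does not give them); the important property is that both perturbation layers are active only during training and become identity functions at inference time:

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

def noisy_skip_concat(x, features):
    # Perturb the re-injected input during training so the embedded hourglass
    # cannot rely on it to memorise a unique code per training image.
    skip = layers.GaussianNoise(0.1)(x)        # identity when training=False
    skip = layers.SpatialDropout2D(0.2)(skip)  # likewise identity at inference
    return layers.Concatenate()([features, skip])

inp = layers.Input(shape=(16, 16, 4))
feats = layers.Conv2D(8, 3, padding="same", activation="relu")(inp)
model = tf.keras.Model(inp, noisy_skip_concat(inp, feats))
```

At inference the skip channels pass through unchanged, so downstream layers always see the clean original data; only during training is it corrupted.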
The network employs special, deeper feature generator blocks just ahead of the output; these are a more advanced version of the Feature Generator Blocks that provide the same benefits while keeping the network’s parameter and FLOP counts to a minimum. This particular embodiment of HooVorNet has only 9 million parameters and requires 4 billion floating point operations.
In consideration of computational efficiency and the overall cost of security, I’m providing in the zip file below a TFLite model trained for person segmentation, as well as some convenience code for building a Keras HooVorNet model to train for alternate purposes. I grant license to the use of the HooVorNet network architecture and any derivatives thereof, and the person segmentation weights and any derivatives thereof, for any use which reasonably causes no harm to people, provided that the device or software utilizing the HooVorNet architecture or derivatives thereof prominently displays one of the following statements during use: “Powered by Hints Of Ozone HooVorNet.” or “Derived from Hints Of Ozone HooVorNet.” No warranty is provided for the quality of the aforementioned, and by download or use thereof, you accept all liability for harms or damages arising therefrom. Any use which results in harms or damages shall be considered to be unlicensed use.
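For anyone unfamiliar with TFLite, the inference loop for a model like the one in the zip follows a standard pattern. The sketch below demonstrates it with a tiny in-memory stand-in model, since the real file name, input size, and output format are whatever the zip provides:

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

# Stand-in model: a single conv producing a per-pixel sigmoid "mask".
# This only exists so the TFLite workflow below is runnable end to end.
stand_in = tf.keras.Sequential([
    layers.Input(shape=(8, 8, 3)),
    layers.Conv2D(1, 3, padding="same", activation="sigmoid"),
])
tflite_bytes = tf.lite.TFLiteConverter.from_keras_model(stand_in).convert()

# For the released model you would instead pass
# model_path="<file from the zip>" to the interpreter.
interpreter = tf.lite.Interpreter(model_content=tflite_bytes)
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

frame = np.zeros(inp["shape"], dtype=np.float32)  # placeholder for a real image
interpreter.set_tensor(inp["index"], frame)
interpreter.invoke()
mask = interpreter.get_tensor(out["index"])       # per-pixel probabilities
```

Querying `get_input_details()` rather than hard-coding shapes means the same loop works whatever resolution the shipped model was exported at.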
Original Post: HooVorNet. The Hints of Ozone Vortal Network