Model Optimization --- Data Annotation

Common labeling errors

Data is the foundation of the model. In our experience, when a model's accuracy is low, the cause is an annotation problem about 90% of the time.

The following is a list of common annotation problems. We hope readers can compare them against their own project data and improve their annotation method accordingly.

Mistake 1: Defects with different visual styles are all given a single "defect" label

The annotator does not care about the type of defect and only expects the model to detect that a defect is present. For convenience, many different types of defects are all marked with one label, "defect". This "confuses" the model, which then fails to recognize the defects.

The right way:

Whether or not you care about the defect type, classify defects by their visual differences, such as scratches and pits, and give each class its own label.

The following figure takes carton defect detection as an example. Defects with different visual styles are given different labels, which can significantly improve the model's recognition results.
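One way to check an existing dataset for this mistake is to count how often each label is used and flag catch-all names. The sketch below is only an illustration: the record layout, the file names, and the GENERIC_LABELS set are assumptions made for the example, not the platform's export format.

```python
from collections import Counter

# Hypothetical annotation records; the field names are assumptions for illustration.
annotations = [
    {"image": "carton_001.jpg", "label": "defect"},
    {"image": "carton_001.jpg", "label": "scratch"},
    {"image": "carton_002.jpg", "label": "Defect"},
    {"image": "carton_003.jpg", "label": "pit"},
]

# Assumed list of catch-all names that hide visually different defect types.
GENERIC_LABELS = {"defect", "defects"}


def audit_labels(records, generic=GENERIC_LABELS):
    """Count labels (case-insensitive) and return the catch-all labels in use."""
    counts = Counter(record["label"].strip().lower() for record in records)
    flagged = sorted(label for label in counts if label in generic)
    return counts, flagged


counts, flagged = audit_labels(annotations)
for label, count in counts.most_common():
    note = "  <- consider splitting by visual style" if label in flagged else ""
    print(f"{label}: {count}{note}")
```

Any flagged label is a candidate for review: open those annotations and re-label them by what the defect looks like.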

Mistake 2: A picture contains many defects, but only some are marked and the rest are skipped

Both the labeled parts and the unlabeled "background" parts of a picture are fed into the model for training. If a region is actually defective but is not marked, this is equivalent to telling the model "this is not a defect". The model is then left with a contradiction: why are some of these similar-looking regions defects while others are not?

The right way:

Mark every defect in a picture. If there are too many defects and you do not want to mark them all, use the "Shielding area" function on the platform's annotation page to cover the parts you do not want to mark.

In general, if the targets have a consistent style, you can try training after labeling a few dozen targets.

Below is an example of annotated steel pipes, which follows the principle "Everything that should be labeled should be labeled".

Mistake 3: "Are these defects? It's a little bit ambiguous, but let the model figure it out"

When a human is not clear about how to label a defect, they apply different standards from one annotation to the next. The model is then confused during training and ends up as unclear as the human.

The right way:

It is important for annotators to agree on a clear standard for what counts as a defect, especially if multiple people are labeling the same dataset. A model trains consistently only if the data is labeled to a uniform standard.
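When several people label the same images, a simple way to see whether they share a standard is to compare their boxes with intersection over union (IoU). The following is a minimal sketch, assuming boxes are stored as (x_min, y_min, x_max, y_max) tuples; the 0.5 threshold and the sample coordinates are illustrative assumptions, not platform defaults.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    inter_w = max(0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def agreement(boxes_a, boxes_b, threshold=0.5):
    """Fraction of annotator A's boxes that overlap some box from annotator B
    with IoU >= threshold. Note that this measure is not symmetric."""
    if not boxes_a:
        return 1.0 if not boxes_b else 0.0
    matched = sum(
        1 for a in boxes_a if any(iou(a, b) >= threshold for b in boxes_b)
    )
    return matched / len(boxes_a)


# Two annotators labeling the same image (made-up coordinates).
annotator_a = [(10, 10, 50, 50), (100, 100, 140, 140)]
annotator_b = [(12, 12, 52, 52)]

print(agreement(annotator_a, annotator_b))  # annotator B skipped one of A's defects
print(agreement(annotator_b, annotator_a))
```

A low score in either direction suggests the annotators disagree about what counts as a defect and should settle the standard before labeling more data.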

Mistake 4: Mark the defect with a box larger than the target

The oversized box includes the defect target plus a lot of background. This makes the model confuse target and background when it uses anchor boxes to locate the defect, which reduces its recognition accuracy.

Here is an example of an incorrect annotation. The boxes for "puncture hole" and "scratch hole" are significantly larger than the actual defect areas. Such labeling leads to a poorly trained model.

The right way:

Annotate each defect with a shape that fits it: use an appropriately sized rectangular box (for object detection) or a polygon (for pixel-level segmentation) drawn closely around the defect target.
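If a rough outline of the defect is available, box tightness can be checked numerically. The sketch below assumes boxes are (x_min, y_min, x_max, y_max) tuples and the defect outline is a list of (x, y) points; the function names and sample coordinates are hypothetical and chosen only for illustration.

```python
def tight_box(points):
    """Smallest axis-aligned box that contains every (x, y) point."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))


def box_area(box):
    """Area of a box given as (x_min, y_min, x_max, y_max)."""
    return max(0, box[2] - box[0]) * max(0, box[3] - box[1])


def tightness(annotated_box, defect_points):
    """Tight-box area divided by annotated-box area.

    A value near 1.0 means the annotated box fits the defect closely;
    a small value means the box is mostly background. A value above 1.0
    means the annotated box is smaller than the defect outline.
    """
    annotated_area = box_area(annotated_box)
    if annotated_area == 0:
        return 0.0
    return box_area(tight_box(defect_points)) / annotated_area


# Made-up outline of a single defect.
defect_outline = [(40, 40), (60, 42), (58, 60), (42, 58)]

print(tightness((40, 40, 60, 60), defect_outline))  # fitted box
print(tightness((20, 20, 80, 80), defect_outline))  # oversized box
```

Boxes with a low ratio are worth redrawing so that they hug the defect.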

The correct labeling sample should be as shown in the figure below:

How to Optimize Annotations?

When a model is tested on the platform, you may see that some targets are not recognized correctly in the "tested" images. This usually happens because the annotated samples are not varied enough and the test image contains a target style the model has not seen. It is also a great opportunity to enrich the variety of the annotation data and improve how well the model generalizes.

You can use the "Test result transfer" function to convert a tested image into an annotation image. The process is as follows.

  1. Check the tested images and manually re-label the targets that were identified incorrectly or missed. Select the labeling tool. The first time you click an image to label it, a prompt appears. Click "OK".

  2. After relabeling, click "Save".
