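1 Dataset Introduction
1.1 Preface
I recently went looking for a fish dataset to practice object detection on, but datasets for such a specific (and niche) domain are scarce. I first found the Labeled Fish In The Wild dataset. It has bounding-box annotations, but only for a single generic class (fish), with no species-level labels. So I looked further and found the fish4knowledge dataset. It has labels for 23 classes but no bounding-box coordinates; all it offers is two kinds of images, the original image and its corresponding mask. I had planned to annotate the images myself with LabelImg (but there are a lot of images and I am lazy). As it happens, I had recently read a few object detection papers that build on semantic segmentation, and one look at segmentation-style masks like these gave me an idea for recovering the coordinates. So I decided to use the masks to generate the coordinates in batch and build the dataset format I need.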
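1.2 The fish4knowledge Dataset
All fish images in the dataset are cropped from underwater video footage. The dataset covers 23 species and 27,370 images in total, but the resolution is very low. The figure below shows representative species names and the number of detections for each. The data is highly imbalanced: the most common species has roughly 1,000 times as many images as the least common one. All species labels were assigned manually by marine biologists.
The dataset contains two kinds of images (fish images and their corresponding mask images, with no location information). Each kind has 23 subfolders, one per species.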
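The imbalance described above can be checked with a per-folder count. The following is a minimal sketch, not part of the original scripts. It assumes the images sit in one subfolder per species under a single root directory; the helper names `count_images_per_class` and `imbalance_ratio` are made up for illustration.

```python
import os
from collections import Counter


def count_images_per_class(root_dir, extensions=('.png', '.jpg', '.jpeg')):
    """Count image files in each immediate subfolder of root_dir."""
    counts = Counter()
    for entry in sorted(os.listdir(root_dir)):
        class_dir = os.path.join(root_dir, entry)
        if not os.path.isdir(class_dir):
            continue  # skip loose files next to the class folders
        counts[entry] = sum(
            1 for name in os.listdir(class_dir)
            if name.lower().endswith(extensions)
        )
    return counts


def imbalance_ratio(counts):
    """Ratio between the largest and the smallest non-empty class."""
    sizes = [n for n in counts.values() if n > 0]
    return max(sizes) / min(sizes)
```

Calling `count_images_per_class('fish_image')` (the path is a placeholder for wherever the fish images were unpacked) and passing the result to `imbalance_ratio` would give a rough measure of how skewed the classes are.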
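2 Building a VOC-Format Dataset
2.1 Validating the Idea
Draw a box around the target on the mask image and return its coordinates. File name: bbox.py.
The output is shown below. It suggests that this method can replace manual work for batch annotation: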
(114, 116, 3)
[[31 29]
[87 85]]
So what are we waiting for? Let's use this idea to annotate in batch and generate the corresponding data format!
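The contents of bbox.py are not shown here, so the following is only a sketch of the idea under stated assumptions: the mask has a zero (black) background and a non-zero fish region, and the box is returned as [[xmin, ymin], [xmax, ymax]], which is one plausible reading of the printed output above. The name `mask_to_bbox` is hypothetical. The sketch uses NumPy only, so the mask array can come from any image loader.

```python
import numpy as np


def mask_to_bbox(mask):
    """Return [[xmin, ymin], [xmax, ymax]] of the non-zero region, or None."""
    mask = np.asarray(mask)
    if mask.ndim == 3:
        # Collapse the channel axis: a pixel counts as foreground
        # if any of its channels is non-zero.
        mask = mask.any(axis=2)
    rows = np.flatnonzero(mask.any(axis=1))  # y indices containing foreground
    cols = np.flatnonzero(mask.any(axis=0))  # x indices containing foreground
    if rows.size == 0:
        return None  # empty mask, nothing to box
    return np.array([[cols[0], rows[0]], [cols[-1], rows[-1]]])
```

With OpenCV, `mask_to_bbox(cv2.imread(mask_path))` would give the two corners to draw with `cv2.rectangle` for a visual check (`mask_path` is a placeholder). If the masks contain stray noise pixels, thresholding first or keeping only the largest connected component may be needed; that is not handled here.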
2.2 Generating VOC-Format XML Files
File name: list2xml.py.
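The contents of list2xml.py are not shown here. As a rough sketch of what such a step might look like, the function below builds a VOC-style annotation with the standard library's `xml.etree.ElementTree`. It writes only the tags that the test script in section 2.4 reads (`folder`, `filename`, `size`, `object/name`, `object/bndbox`); the function name `build_voc_xml` and its argument layout are assumptions, and a full VOC file usually carries more fields (for example `difficult`) that some training tools expect.

```python
import xml.etree.ElementTree as ET


def build_voc_xml(folder, filename, width, height, depth, objects):
    """Build a VOC-style annotation tree.

    objects: iterable of (name, xmin, ymin, xmax, ymax) tuples.
    """
    root = ET.Element('annotation')
    ET.SubElement(root, 'folder').text = folder
    ET.SubElement(root, 'filename').text = filename

    size = ET.SubElement(root, 'size')
    for tag, value in (('width', width), ('height', height), ('depth', depth)):
        ET.SubElement(size, tag).text = str(value)

    for name, xmin, ymin, xmax, ymax in objects:
        obj = ET.SubElement(root, 'object')
        ET.SubElement(obj, 'name').text = name
        box = ET.SubElement(obj, 'bndbox')
        for tag, value in (('xmin', xmin), ('ymin', ymin),
                           ('xmax', xmax), ('ymax', ymax)):
            ET.SubElement(box, tag).text = str(value)

    return ET.ElementTree(root)
```

The returned tree can be saved with `tree.write(xml_path, encoding='utf-8')`, where `xml_path` is a placeholder for a file under the Annotations directory.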
2.3 Batch-Generating the CSV Table and XML Files
File name: main.py.
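The contents of main.py are not shown here, so the sketch below only illustrates the bookkeeping such a batch step needs: walking the mask folders, deriving the matching image name and class label, and writing a CSV. Everything about the naming is an assumption inferred from the file and class names that appear in section 2.4: mask folders named like `mask_01`, mask files named like `mask_<id>.png` that map to `fish_<id>.png`, and class labels like `fish01`. The CSV column layout and all function names are hypothetical.

```python
import csv
import os


def image_name_for_mask(mask_name):
    """Assumed naming rule: 'mask_<id>.png' -> 'fish_<id>.png'."""
    return mask_name.replace('mask', 'fish', 1)


def class_name_for_dir(mask_dir_name):
    """Assumed naming rule: 'mask_01' -> 'fish01'."""
    return 'fish' + mask_dir_name.split('_')[-1]


def collect_mask_files(mask_root):
    """Yield (class_name, mask_path, image_name) for every mask under mask_root."""
    for dir_name in sorted(os.listdir(mask_root)):
        class_dir = os.path.join(mask_root, dir_name)
        if not os.path.isdir(class_dir):
            continue
        for mask_name in sorted(os.listdir(class_dir)):
            yield (class_name_for_dir(dir_name),
                   os.path.join(class_dir, mask_name),
                   image_name_for_mask(mask_name))


def write_csv(rows, csv_path):
    """Write one row per box: filename, class, xmin, ymin, xmax, ymax."""
    with open(csv_path, 'w', newline='') as f:
        writer = csv.writer(f)
        writer.writerow(['filename', 'class', 'xmin', 'ymin', 'xmax', 'ymax'])
        writer.writerows(rows)
```

In a full script, each item from `collect_mask_files` would be loaded, boxed using the mask idea from section 2.1, appended to the CSV rows, and written out as one XML file per image.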
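The result is as follows:
2.4 Match Test
With everything generated in batch, how do we check that each image has a matching annotation file and that none are missing? The test below does that. File name: isPatch.py.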
import os
import xml.etree.ElementTree as ET

import cv2


# Check whether the file names match. The XML names were derived from the mask
# image names with "mask" replaced, so this effectively tests whether the masks
# follow the same naming rule as the original images; otherwise training would
# fail to find the original image.
def patch_rate(xmlDirPath, pictureDirPath):
    # os.listdir returns names in arbitrary order, so sort both lists before
    # comparing them position by position.
    xmlList = sorted(os.listdir(xmlDirPath))
    pngList = sorted(os.listdir(pictureDirPath))
    count = 0
    for xmlName, pngName in zip(xmlList, pngList):
        if os.path.splitext(xmlName)[0] == os.path.splitext(pngName)[0]:
            count += 1
    return count, len(xmlList), len(pngList), count / len(xmlList)


def visualization_image(image_name, xml_file_name):
    tree = ET.parse(xml_file_name)
    root = tree.getroot()
    object_lists = []
    for child in root:
        if child.tag == "folder":
            print(child.tag, child.text)
        elif child.tag == "filename":
            print(child.tag, child.text)
        elif child.tag == "size":  # parse size
            for size_child in child:
                if size_child.tag in ("width", "height", "depth"):
                    print(size_child.tag, size_child.text)
        elif child.tag == "object":  # parse object
            singleObject = {}
            for object_child in child:
                if object_child.tag == "name":
                    singleObject["name"] = object_child.text
                elif object_child.tag == "bndbox":
                    for bndbox_child in object_child:
                        if bndbox_child.tag in ("xmin", "ymin", "xmax", "ymax"):
                            singleObject[bndbox_child.tag] = bndbox_child.text
            if len(singleObject) > 0:
                object_lists.append(singleObject)
    img = cv2.imread(image_name)
    if img is None:  # cv2.imread returns None instead of raising on failure
        raise FileNotFoundError(image_name)
    for object_coordinate in object_lists:
        bboxes_draw_on_img(img, object_coordinate)
    cv2.imshow("capture", img)
    cv2.waitKey(0)
    cv2.destroyAllWindows()


def bboxes_draw_on_img(img, bbox, color=(255, 0, 0), thickness=2):
    # Draw the bounding box. OpenCV uses BGR order, so (255, 0, 0) is blue.
    print(bbox)
    p1 = (int(float(bbox["xmin"])), int(float(bbox["ymin"])))
    p2 = (int(float(bbox["xmax"])), int(float(bbox["ymax"])))
    cv2.rectangle(img, p1, p2, color, thickness)


if __name__ == '__main__':
    # Count the regular files in each directory.
    print('xmlnum:', len([lists for lists in os.listdir('../VOC2007/Annotations') if
                          os.path.isfile(os.path.join('../VOC2007/Annotations', lists))]))
    print('picturenum:', len([lists for lists in os.listdir('../VOC2007/JPEGImages') if
                              os.path.isfile(os.path.join('../VOC2007/JPEGImages', lists))]))
    # The XML files are named after the mask files; JPEGImages holds copies of
    # the original images (that is, the originals are untouched).
    print(patch_rate('../VOC2007/Annotations', '../VOC2007/JPEGImages'))
    visualization_image("../VOC2007/JPEGImages/fish_000000009598_05468.png",
                        "../VOC2007/Annotations/fish_000000009598_05468.xml")
Output:
xmlnum: 27370
picturenum: 27370
(27370, 27370, 27370, 1.0)
folder JPEGImages
filename fish_000000009598_05468.png
width 127
height 121
depth 3
{'name': 'fish01', 'xmin': '33', 'ymin': '32', 'xmax': '95', 'ymax': '89'}
Doesn't that look better than manual labeling? And it takes far less effort.
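The position-by-position comparison in `patch_rate` reports only how many names match. If the counts ever disagree, it helps to know which files are unmatched. A set difference over the file stems gives that directly; this is a small complementary sketch, and `find_unmatched` is a hypothetical helper, not part of isPatch.py.

```python
import os


def find_unmatched(xml_dir, image_dir):
    """Return (stems with an XML but no image, stems with an image but no XML)."""
    xml_stems = {os.path.splitext(name)[0] for name in os.listdir(xml_dir)}
    img_stems = {os.path.splitext(name)[0] for name in os.listdir(image_dir)}
    return sorted(xml_stems - img_stems), sorted(img_stems - xml_stems)
```

For a fully matched dataset, `find_unmatched('../VOC2007/Annotations', '../VOC2007/JPEGImages')` should return two empty lists.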
3 Data Source
http://groups.inf.ed.ac.uk/f4k/GROUNDTRUTH/RECOG/
[1]. B. J. Boom, P. X. Huang, C. Spampinato, S. Palazzo, J. He, C. Beyan, E. Beauxis-Aussalet, J. van Ossenbruggen, G. Nadarajan, J. Y. Chen-Burger, D. Giordano, L. Hardman, F.-P. Lin, R. B. Fisher, "Long-term underwater camera surveillance for monitoring and analysis of fish populations", Proc. Int. Workshop on Visual observation and Analysis of Animal and Insect Behavior (VAIB), in conjunction with ICPR 2012, Tsukuba, Japan, 2012.
[2]. B. J. Boom, P. X. Huang, J. He, R. B. Fisher, "Supporting Ground-Truth annotation of image datasets using clustering", 21st Int. Conf. on Pattern Recognition (ICPR), 2012.