    Please use this permanent URL to cite or link to this item: https://tkuir.lib.tku.edu.tw/dspace/handle/987654321/123315

    Title: Swin-JDE: Joint Detection and Embedding Multi-Object Tracking in Crowded Scenes Based on Swin-Transformer
    Authors: Tsai, Chi-Yi;Shen, Guan-Yu;Nisar, Humaira
    Date: 2023-03
    Upload time: 2023-04-28 17:36:47 (UTC+8)
    Abstract: Multi-object tracking (MOT) is a highly valued and challenging research topic in computer vision. To achieve more robust tracking, recently published MOT methods tend to use anchor-free object detectors, which avoid the identity ambiguity problem that anchor-based methods encounter when learning appearance features. In practical applications, however, the detection accuracy of anchor-free object detectors based on classical convolutional neural networks drops significantly in crowded scenes. To improve detection and tracking performance in crowded scenes, this paper proposes an anchor-free joint detection and embedding (JDE) MOT method based on the Transformer architecture, called Swin-JDE. The proposed method includes a novel Patch-Expanding module, which improves the spatial information of feature maps through learned up-sampling and Einops notation-based rearrangement, enhancing the detection and tracking performance of the MOT model. For training, we propose a two-step method that trains the detection branch separately from the appearance branch to enhance the detection robustness of anchor-free predictors. During training, we also propose an examination method that removes occluded targets from the training dataset to improve the accuracy of the appearance embedding layer. For data association, we propose a new post-processing method that simultaneously considers three factors (detection confidence, appearance embedding distance, and intersection-over-union (IoU) distance) when matching each tracklet to the detection information, improving the tracking robustness of the MOT model.
Experimental results show that the proposed method achieves 70.38% multiple object tracking accuracy (MOTA) and 69.53% identification F1-score (IDF1) on the MOT20 benchmark dataset, and reduces the number of identity switches (ID Switch) to 2026. Compared with FairMOT, the proposed method improves MOTA and IDF1 by 8.58% and 2.23%, respectively.
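
The abstract describes the Patch-Expanding module only in outline: up-sampling learned by the network, followed by an Einops-style rearrangement. As a rough illustration of that general idea, and not the authors' implementation, the sketch below applies a per-position linear projection and then moves the widened channel axis into the spatial axes. The function name `patch_expand`, the `weight` matrix, the scale factor of 2, and the specific rearrangement pattern are all assumptions made for this example; the rearrangement is written with plain NumPy so the block has no dependency on the einops package.

```python
import numpy as np


def patch_expand(x, weight, scale=2):
    """Hypothetical patch-expanding step (illustrative only).

    x      : feature map of shape (H, W, C)
    weight : projection matrix of shape (C, scale * scale * C_out),
             standing in for a learned linear layer
    returns: feature map of shape (H * scale, W * scale, C_out)
    """
    h, w, _ = x.shape
    c_out = weight.shape[1] // (scale * scale)

    # Learned part: widen the channel axis at every spatial position.
    y = x @ weight  # (H, W, scale * scale * C_out)

    # Rearrangement part, equivalent to the einops pattern
    # 'h w (p1 p2 c) -> (h p1) (w p2) c'.
    y = y.reshape(h, w, scale, scale, c_out)  # split channels into (p1, p2, c)
    y = y.transpose(0, 2, 1, 3, 4)            # reorder to (H, p1, W, p2, C_out)
    return y.reshape(h * scale, w * scale, c_out)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    features = rng.standard_normal((4, 4, 8))   # toy 4x4 map with 8 channels
    projection = rng.standard_normal((8, 16))   # 16 = 2 * 2 * 4 output channels
    expanded = patch_expand(features, projection)
    print(expanded.shape)
```

Each input position contributes a `scale` x `scale` block of output positions, so spatial resolution grows while the projection weights decide what goes into each new position. In a real model the projection would be a trained layer rather than a random matrix.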
    Relation: Engineering Applications of Artificial Intelligence 119(2), p.1-16
    DOI: 10.1016/j.engappai.2022.105770
    Appears in Collections: [Graduate Institute & Department of Electrical Engineering] Journal Articles




