Jan 10, 2025: The website is coming. Call for papers.
Today, ubiquitous multimedia sensors and large-scale computing infrastructures are producing 3D multi-modality data at a rapid pace, such as 3D point clouds acquired with LiDAR sensors, RGB-D videos recorded by Kinect cameras, meshes of varying topology, and volumetric data. 3D multimedia combines different content forms such as text, audio, images, and video with 3D information, which enables better perception of the world, since the real world is three-dimensional rather than two-dimensional. For example, robots can manipulate objects successfully by recognizing an object via RGB frames and perceiving its size via a point cloud. Researchers have strived to push the limits of 3D multimedia search and generation in various applications, such as autonomous driving, robotic visual navigation, smart industrial manufacturing, logistics distribution, and logistics picking. 3D multimedia (e.g., videos and point clouds) can also help agents grasp, move, and place packages automatically in logistics picking systems.
Therefore, 3D multimedia analytics is one of the fundamental problems in multimedia understanding. Different from 3D vision, 3D multimedia analytics mainly concentrates on fusing 3D content with other media. It is a very challenging problem that involves multiple tasks such as human 3D mesh recovery and analysis, 3D shape and scene generation from real-world data, 3D virtual talking heads, 3D multimedia classification and retrieval, 3D semantic segmentation, 3D object detection and tracking, 3D multimedia scene understanding, and so on. The purpose of this workshop is to: 1) bring together state-of-the-art research on 3D multimedia analysis; 2) call for a coordinated effort to understand the opportunities and challenges emerging in 3D multimedia analysis; 3) identify key tasks and evaluate state-of-the-art methods; 4) showcase innovative methodologies and ideas; 5) introduce interesting real-world 3D multimedia analysis systems or applications; and 6) propose new real-world or simulated datasets and discuss future directions. We solicit original contributions in all fields of 3D multimedia analysis that explore multi-modality data to generate strong 3D data representations. We believe this workshop will offer a timely collection of research updates to benefit researchers and practitioners in the broad multimedia communities.
We invite submissions to the ICME 2025 Workshop on 3D Multimedia Analytics, Search and Generation (3DMM2025), which brings researchers together to discuss robust, interpretable, and responsible technologies for 3D multimedia analysis. We solicit original research and survey papers, which must be no longer than 6 pages (including all text, figures, and references). Each submitted paper will be peer-reviewed by at least three reviewers. All accepted papers will be presented as either oral or poster presentations, and a best paper award will be given. Papers that violate anonymity or do not use the ICME submission template will be rejected without review. By submitting a manuscript to this workshop, the authors acknowledge that no paper substantially similar in content has been submitted to another workshop or conference during the review period. Authors should prepare their manuscript according to the Guide for Authors of ICME. For detailed instructions, see here. Submission address is here.
The scope of this workshop includes, but is not limited to, the following topics:
Fast Review for Rejected Regular Submissions of ICME 2025
We have set up a Fast Review mechanism for regular submissions rejected by the ICME main conference. We strongly encourage authors of rejected papers to submit them to this workshop. To submit through Fast Review, authors must write a cover letter (1 page) explaining the revisions made to the paper and attach all previous reviews. All papers submitted through Fast Review will be reviewed directly by meta-reviewers, who will make the decisions.
Time | Description |
---|---|
9:30-9:40 | Opening |
9:40-10:05 | Keynote 1: Research on Key Techniques of Task-Oriented Point Cloud Sampling Based on Deep Learning |
10:05-10:30 | Keynote 2: Cross-Modal Vision-and-Language Intelligence: Methodologies and Applications |
10:30-10:55 | Keynote 3: Multi-modal Modelling of Body Language for Digital Human |
10:55-12:15 | 8 Oral Presentations (~10 min each) |
12:15-12:20 | Best Paper Award Announcement, Discussion, and Closing |
Oral Order | Time | Paper Title |
---|---|---|
1 | 10:55-11:05 | Keypoint Ensemble For Image Matching |
2 | 11:05-11:15 | MF-Adapter: Better 3D Foundation Model with Multimodal Fusion Adapter |
3 | 11:15-11:25 | DHGS: Decoupled Hybrid Gaussian Splatting for Driving Scene |
4 | 11:25-11:35 | Optimizing Cooperative Multi-Object Tracking using Graph Signal Processing |
5 | 11:35-11:45 | Guided Model-based LiDAR Super-Resolution for Resource-Efficient Automotive Scene Segmentation |
6 | 11:45-11:55 | LIVE-FIT: LED-based Immersive Virtual Environment with Fusion, Interaction, and Transmission |
7 | 11:55-12:05 | Benchmarking Learnable Mesh and Texture Representations for Immersive Digital Twins |
8 | 12:05-12:15 | MVLLaVA: An Intelligent Agent for Unified and Flexible Novel View Synthesis |
Previous Workshops on 3DMM: 3DMM-ICME2022, 3DMM-ICME2023, 3DMM-ICME2024
If you have any questions, feel free to contact peng [DOT] dai [DOT] ca [AT] ieee.org