Visual Generative Models: Past, Present, and Future

DICTA 2025 Workshop

📅 02/12/2025   |   ⏰ 9:00 am - 6:00 pm   |   📍 Location: Union House - Function room 1 (The University of Adelaide)

About

Recent breakthroughs in generative adversarial networks, diffusion models, and autoregressive models have dramatically advanced the state of visual content generation, with widespread applications in generating images, videos, 3D objects, and more. These advances not only push the frontiers of synthesis quality and scalability but also unlock new applications in design, entertainment, vision, and scientific domains, and can even improve or reformulate vision tasks themselves. However, several fundamental and practical challenges remain, e.g., improving controllability, enhancing fidelity and realism, scaling across modalities, ensuring alignment with human values, and achieving efficient, safe deployment. This workshop aims to provide a broad forum for exploring the past breakthroughs, current developments, and future directions of visual generative models, with particular emphasis on foundational innovations, emerging challenges, and practical applications.

News

📍 Updated Location Announcement

We are excited to share that the venue for this workshop has now been confirmed! The event will take place at:

🏫 Union House – Function Room 1 (🗺️ Map)
The University of Adelaide


🗓 Agenda Now Available

The full workshop schedule, including invited talks and oral presentations, has been released — scroll down to view details.

👉 Click to jump to the agenda


🔥 Call for Oral Presentations

We invite researchers to join us as one of the SIX oral presenters at the workshop to showcase their outstanding research. If you are interested, please complete the Google Form with the details of the work you would like to present.
Note that presenters must give their presentation in person at the workshop.

Eligibility (any of the following):

Submission Information

Schedule

| Time | Session | Speaker |
| --- | --- | --- |
| **Morning Session** | | |
| 09:00–09:10 | Welcome & Opening Remarks | Dr. Dong Gong |
| 09:10–10:10 | Keynote 1: Continue of Simulator Environments | Prof. Richard Hartley |
| 10:10–11:10 | Keynote 2: From generative image synthesis to 4D modelling (Online) | Prof. Chunhua Shen |
| 11:10–11:30 | Break and Coffee Chat | |
| 11:30–11:45 | Oral Presentation 1: Toward Human-like Multimodal Understanding, Reasoning, and Generation | Qi Chen |
| 11:45–12:00 | Oral Presentation 2: Controllability Matters: From Static Generation to Interactive World Models | Zicheng Duan |
| 12:00–13:30 | Lunch Break (Union House – Function Room 1) | |
| **Afternoon Session** | | |
| 13:30–14:30 | Keynote 3: End-to-end image generation training | A/Prof. Liang Zheng |
| 14:30–14:45 | Oral Presentation 3: Point-cloud–centric paradigm for general 3D scene and asset generation | Jiatong Xia |
| 14:45–15:00 | Oral Presentation 4: Growing Models on Demand: Dynamic Modular Expansion for Continual Learning | Huiyi Wang |
| 15:00–15:30 | Break and Coffee Chat | |
| 15:30–15:45 | Oral Presentation 5: The Human Brain as a Blueprint for AI’s Frontier: Symbolic Perception, Vision-Grounded Reasoning, and Learning Through Experience | Shan Zhang |
| 15:45–16:00 | Oral Presentation 6: Exploring Primitive Visual Measurement Understanding and the Role of Output Format in Learning in Vision-Language Models | Ankit Yadav |
| 16:00–16:30 | Panel Discussion | Keynote Speakers and Organizers |
| 16:30–16:40 | Closing Remarks | |

Invited Speakers

Richard Hartley, The Australian National University
Chunhua Shen, Zhejiang University
Liang Zheng, The Australian National University

Organizers

Xinyu Zhang, University of Auckland
Lingqiao Liu, The University of Adelaide
Chang Xu, The University of Sydney
Yujun Cai, The University of Queensland
Jiaxian Guo, Google Research
Anton van den Hengel, The University of Adelaide
Dong Gong, The University of New South Wales