Xinqi Zhang

Bio Computations: Redefining Human Input in Digital Systems

In the realm of digital technology, the evolution of human input methods has been a journey of continuous innovation. From the clack of typewriter keys to the gentle taps on a smartphone screen, our methods of communicating with machines have always been a reflection of our technological capabilities. Today, we stand on the brink of a new era, one that goes beyond the physical and tactile—enter the world of "bio computations."

The Evolution of Human-Digital Interaction

In the early days, interaction with computers was rigid and unforgiving, confined to keystrokes and command lines. The advent of graphical user interfaces introduced the mouse, making interaction more intuitive. The touchscreen revolution brought about a more direct form of input, aligning digital actions with human touch. However, these advancements still relied on a conscious effort to communicate with our devices.

Understanding "Bio Computations"

Now, imagine a world where our digital devices understand us just as our fellow humans do. This is where "bio computations" come into play. It's a simple yet profound idea: every human emits a plethora of biological signals, whether it's a hand gesture, the tone of voice, fluctuating heartbeats, or even the pattern of breathing. These are all forms of data, rich in information and ripe for interpretation by digital systems.

For instance, consider a fitness tracker that monitors your heart rate to suggest workout routines, or emotion recognition software in customer service bots that respond to your mood. These are the early seeds of a technology that promises to grow exponentially.
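To make the idea concrete, here is a minimal Python sketch of how a tracker might map a window of heart-rate readings to a workout suggestion. The zone thresholds and the suggestions themselves are illustrative assumptions, not any vendor's actual logic:

```python
# A minimal, illustrative sketch: interpreting a heart-rate stream
# as a bio-signal. All thresholds are made up for demonstration.

def average_heart_rate(samples: list[int]) -> float:
    """Mean of recent beats-per-minute readings."""
    return sum(samples) / len(samples)

def suggest_workout(samples: list[int], resting_hr: int = 60) -> str:
    """Map a recent heart-rate window to a (hypothetical) suggestion."""
    avg = average_heart_rate(samples)
    if avg < resting_hr + 20:
        return "You look well rested: try a higher-intensity interval session."
    elif avg < resting_hr + 60:
        return "Moderate load detected: a steady cardio session fits here."
    else:
        return "Elevated heart rate: consider recovery or light stretching."

print(suggest_workout([72, 75, 71, 78]))  # -> well-rested suggestion
```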

The New Era of Human-Computer Interaction (HCI)

The future of HCI lies in systems that don't just wait for our input but actively learn from it. These systems will use machine learning and artificial intelligence to understand the nuances of human behavior. Imagine a digital assistant that doesn't just respond to your commands but anticipates your needs based on your emotional state or a learning app that adapts its teaching style to your current cognitive load, measured through your eye movements.
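As a toy illustration of that last idea, the sketch below adapts lesson difficulty from two gaze-derived proxies for cognitive load. The metrics and cutoffs are hypothetical placeholders; a real system would need validated measures:

```python
# Illustrative only: a rule-based adapter that lowers content difficulty
# when gaze metrics suggest high cognitive load. The proxy metrics and
# cutoffs are assumptions, not validated measures.

from dataclasses import dataclass

@dataclass
class GazeMetrics:
    blink_rate: float        # blinks per minute
    mean_fixation_ms: float  # average fixation duration

def estimate_load(m: GazeMetrics) -> str:
    # Long fixations plus a low blink rate are often read as effortful
    # processing; the exact thresholds here are hypothetical.
    if m.mean_fixation_ms > 400 and m.blink_rate < 10:
        return "high"
    if m.mean_fixation_ms < 200:
        return "low"
    return "medium"

def next_lesson_difficulty(current: int, m: GazeMetrics) -> int:
    load = estimate_load(m)
    if load == "high":
        return max(1, current - 1)   # ease off
    if load == "low":
        return current + 1           # challenge the learner
    return current

print(next_lesson_difficulty(3, GazeMetrics(blink_rate=8, mean_fixation_ms=450)))
```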

Challenges and Opportunities

This shift isn't without its challenges. Interpreting bio-signals accurately requires sophisticated technology and a deep understanding of human biology and behavior. Privacy and security take on new dimensions when dealing with such personal data. However, the opportunities are vast. In healthcare, for example, such technology could lead to early detection of conditions through routine monitoring of physiological signals.

Building a Safe and Responsive System

As we venture into this uncharted territory, the need for ethical guidelines and user consent is paramount. Users must have control over what data is collected and how it's used. Additionally, robust security measures are necessary to protect this sensitive information.

Conclusion

The concept of bio computations opens up a world where our interactions with digital systems are as natural and effortless as breathing. It's a journey filled with immense potential, challenges, and excitement. As we embrace this new era of HCI, we look forward to more intuitive, responsive, and, most importantly, human-centric digital experiences.

Xinqi Zhang

3 Papers I Wish I'd Written

Thank you to Professor Kai Lukoff for providing the template "3 research papers that I wish I had written," which has helped me delve into research reading and discover my main interest in HCI.

This article lists my top 3 recently read research papers on HCI. Essentially, they relate to innovative HCI applications using Generative AI.

Paper 1: Closer Worlds: Using Generative AI to Facilitate Intimate Conversations

Paper Title: Closer Worlds: Using Generative AI to Facilitate Intimate Conversations

Link: https://dl.acm.org/doi/10.1145/3544549.3585651

Key research question(s):

  1. How do digital communication tools affect our ability to experience deep emotional intimacy, and can these tools be improved to better foster a sense of connection?

  2. Can games and generative AI art be used to counteract the trend of digital communication tools limiting emotional intimacy?

  3. What design principles, inspired by facilitation methods, can be effective in fostering emotionally intimate conversations in an ML-assisted two-person game?

  4. How effective is the "Closer Worlds" game in eliciting self-disclosure and fostering emotional intimacy compared to social games without generative AI?

  5. Do the affordances provided by visualizing shared values through a co-creative game lead to comfortable and meaningful conversations?

Key Methods:

  1. Design and Development of "Closer Worlds":

    • The researchers have created a machine learning-assisted game for two players intended to encourage emotionally intimate conversations through co-creative world-building.

  2. Exploration of Design Principles:

    • The paper discusses design principles derived from facilitation methods that may help in encouraging emotional closeness within the game environment.

  3. Pilot Study:

    • A pilot study is conducted to assess the effectiveness of the game. This involves interviews, observations, surveys, or behavioral analysis to measure self-disclosure and emotional intimacy.

  4. Comparative Analysis:

    • A comparison is made between Closer Worlds and a social game without generative AI to see how each affects the level of self-disclosure among participants.

  5. Assessment of Enjoyment and Novelty:

    • The researchers also assess the participants' enjoyment and the novelty of the experience provided by the game, which can influence the game's effectiveness in fostering intimacy.

  6. Discussion on Future Applications:

    • Finally, the paper discusses how co-creative games might leverage generative techniques in the future to create pro-social environments.

Paper 2: WorldSmith: Iterative and Expressive Prompting for World Building with a Generative AI

Paper Title: WorldSmith: Iterative and Expressive Prompting for World Building with a Generative AI

Link: https://dl.acm.org/doi/10.1145/3586183.3606772

Key research question(s):

  1. How can multi-modal image generation systems be used to assist in the process of fictional worldbuilding?

  2. What are the limitations of current "click-once" prompting UI paradigms in generative AI, and how can they be improved to benefit the creative process?

  3. Can a system like WorldSmith enable novice worldbuilders to visualize their concepts effectively and with greater ease?

  4. How do iterative visualization and modification techniques (text input, sketching, region-based filling) impact the creative process in fictional worldbuilding?

Key Methods:

  1. Development of the WorldSmith Tool:

    • The paper discusses the creation of a tool that combines text input, sketching, and region-based filling for users to visualize and edit their fictional worlds.

  2. Formative Study:

    • A preliminary study involving 4 participants was conducted to gather initial feedback on the tool's functionality and usability. This involved qualitative methods such as interviews and observations.

  3. First-Use Study:

    • A larger study with 13 participants was then performed to see how new users interact with the WorldSmith tool. This involved a mix of qualitative and quantitative methods, including task performance measurements, questionnaires, and user feedback sessions.

  4. Expressive Interactions Analysis:

    • The study includes an evaluation of how the tool allows for expressive interactions between the users and the prompt-based models.

  5. Comparative Analysis with Existing UI Paradigms

  6. User Empowerment Assessment

Paper 3: DeepScope: HCI Platform for Generative Cityscape Visualization

Paper Title: DeepScope: HCI Platform for Generative Cityscape Visualization

Link: https://dl.acm.org/doi/10.1145/3334480.3382809

Key research question(s):

  1. How can the process of creating high-quality streetscape visualizations be improved to support urban design and planning?

  2. Can a Generative Neural Network (DCGAN) combined with a Tangible User Interface (TUI) streamline the visualization process for urban design, especially in real-time, multiparty design sessions?

  3. What are the design, development, and deployment considerations for a platform like DeepScope?

  4. How does DeepScope potentially impact the urban design process, and what are its practical implications?

Key Methods:

  1. Design and Development of DeepScope:

    • The paper describes the conceptualization and creation of the DeepScope platform, detailing the integration of a DCGAN with a TUI.

  2. Technical Implementation:

    • The implementation phase involves the actual coding, algorithm development, and hardware setup necessary to create a functioning prototype.

  3. Real-Time Urban Planning Simulation:

    • DeepScope is utilized in a simulated environment to assess its performance in real-time urban planning scenarios. This involves the use of real-time feedback systems.

  4. Multi-Participant Testing:

    • The system is tested in an environment with multiple participants to simulate a real-world urban design session.

  5. Feedback and Iteration

  6. Case Studies or Deployment Narratives

  7. Discussion and Analysis

Xinqi Zhang

Future HCI with the impact of generative AI

The Evolution of Human-Computer Interaction (HCI): From Command Lines to Conversational Interfaces

The story of human-computer interaction (HCI) is one of constant evolution, marked by humanity's quest to make digital technology more accessible, intuitive, and aligned with our natural behaviors. This journey has witnessed several paradigm shifts, reflecting the broader technological trends and societal needs of their times.

The Dawn of HCI: Command-Line Interfaces

The early days of computing were dominated by command-line interfaces (CLI), which required users to interact with computers through a series of typed commands. This form of interaction involved a steep learning curve and demanded a strong understanding of specific command languages, inherently limiting the user base to those with technical expertise.

The GUI Revolution: Widening Accessibility

The introduction of graphical user interfaces (GUI) in the 1980s marked a significant leap forward. Pioneered by companies like Xerox, Apple, and Microsoft, GUIs replaced text-heavy commands with visual icons and menus, making computers more user-friendly and vastly expanding their appeal to the general public.

The Rise of the Internet: Browsers and Search Engines

With the advent of the internet, browsers and search engines became the new frontiers of HCI, allowing users to navigate the vast web of information through hyperlinks and search queries. This era cemented the "query-response" model of HCI, where users input queries and the system provides the best possible results based on algorithms.

Mobile and Touch: The Gesture-Based Interaction

The proliferation of smartphones and tablets introduced touch as the new mode of interaction, with swipes, taps, and pinches becoming second nature. This period saw HCI become more intimate and direct, as devices became extensions of the human body.

Voice Assistants and Conversational UIs: Speaking Naturally

The emergence of voice assistants and conversational interfaces further humanized HCI, enabling people to interact with technology through spoken language. This shift is characterized by the development of AI-driven technologies like Siri, Alexa, and Google Assistant, which can interpret and act on voice commands, making technology even more seamlessly integrated into daily life.

Toward Intuitive and Generative Interactions

Now, we stand on the brink of another significant transition, driven by generative AI and AR. As AI becomes more generative and anticipatory, it is shifting from reactive systems that wait for user input to proactive systems that offer a multitude of possibilities before a user even makes a specific request.

From Tool Use to Partnership

We are moving beyond the concept of computers as tools and towards the idea of them as partners in the creative process. The HCI of tomorrow is not just about issuing commands and receiving outputs, but about engaging in a dialogue with technology, where it understands not just the literal meaning of our words, but the intent and emotion behind them.

The Immersive Turn with AR

Augmented reality is transforming HCI by overlaying digital information onto the physical world, creating a seamless blend of real and virtual that opens up new dimensions for interaction and experience.

Looking Back to Look Forward

By reflecting on the history of HCI, we can appreciate the magnitude of the shift that lies ahead. Each era of HCI brought technology closer to the natural human modes of interaction—from text, to touch, to voice, and now, towards a blend of intuition, creativity, and reality augmentation.

The development of HCI has been a tale of increasing sophistication and personalization. As we research and shape the next chapter with generative AI and AR, we're not just creating new interfaces; we're crafting experiences that resonate more deeply with the human psyche and enable a more profound human-technology symbiosis.

My Research Interests

“Human ingenuity propels technology forward, while technology redefines the landscape of human-computer interaction.”

HCI & Generative AI

Generative AI is rapidly transitioning from a tool that creates content upon request to an intelligent partner capable of enhancing human creativity. Future software equipped with generative AI will not merely serve content but will anticipate needs and craft responses in ways that feel more human, more engaging, and surprisingly inventive.

Towards a Synergetic Relationship

The core of my research revolves around the changing dynamics between humans and software. Rather than acting as passive recipients of information, users could co-create with AI, iterating on their ideas in real-time and watching as AI instantly generates multiple representations of those ideas. This collaborative form of HCI could significantly accelerate innovation in fields such as graphic design, architecture, and beyond.
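The interaction loop I have in mind is simple to express in code. In the sketch below, `generate_images` is a hypothetical stand-in for any text-to-image backend; the point is the shape of the iterate-and-refine cycle, not a particular model:

```python
# A sketch of the co-creation loop described above. `generate_images`
# is a hypothetical placeholder for a real generative backend, not an
# actual library call.

def generate_images(prompt: str, n: int = 4) -> list[str]:
    # Placeholder: a real system would call a text-to-image model here.
    return [f"<render {i} of: {prompt}>" for i in range(n)]

def co_create(initial_prompt: str, rounds: int = 3) -> str:
    prompt = initial_prompt
    for _ in range(rounds):
        candidates = generate_images(prompt)
        for i, image in enumerate(candidates):
            print(f"[{i}] {image}")
        choice = input("Pick a variant to refine (or type a new detail): ")
        # The user's reaction becomes the next iteration's input,
        # closing the human-AI feedback loop.
        prompt = f"{prompt}, refined with: {choice}"
    return prompt

# co_create("a sunlit reading room in a treehouse")
```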

Creative Cohesion and Ethical Considerations

As we chart these new territories, the ethical dimensions of generative AI in HCI cannot be overstated. I am interested in exploring how we can maintain creative ownership, ensure responsible use, and prevent the misuse of such potent technology, all while fostering an environment that encourages creative exploration.

Generative AI & Augmented Reality

My second area of research fascination is the intersection of generative AI with AR technology to create customizable virtual environments for individuals. This combination has the potential to revolutionize not just entertainment and gaming, but also to have profound impacts on various societal aspects.

Mental Health Therapy

In therapeutic settings, AR-generated environments can be tailored to patients' therapeutic needs, providing a safe space for exposure therapy, or crafting serene landscapes for meditation and stress relief. The integration of generative AI could personalize scenarios based on a user's response to treatment, making mental health care more accessible and effective.

Education and Learning

In education, AR can bring learning to life, creating interactive, customizable experiences that adapt to a student's learning pace and style. Generative AI could contribute by creating dynamic content that reflects students' curiosities, thereby enhancing engagement and retention.

Accessibility for Disabilities

For individuals with disabilities, AR combined with generative AI can break down barriers by creating environments tailored to their needs and preferences. Whether it's translating text to speech in real-time or generating virtual sign language interpreters, the applications are vast and varied.

Professional Training

In professional spheres like medicine or aviation, AR simulations, powered by generative AI prompts, can provide hyper-realistic training scenarios. These could adapt to the learner's skill level and modify in real-time to present various challenges, preparing them for real-world situations more effectively.

Urban Planning and Environment

Urban planners could leverage these technologies to visualize and simulate changes to landscapes and infrastructure, and even to predict the environmental impact of potential decisions. Such tools could engage community members in the urban design process, fostering a more democratic and inclusive approach to city planning.

Cultural Preservation

AR can be used to reconstruct historical sites or lost cultural heritage, offering an interactive educational tool that preserves the past. Generative AI can add to this by recreating historical events or simulating ancient civilizations, allowing users to experience history first-hand.



Reference papers:

GenAICHI 2023: Generative AI and HCI at CHI 2023 https://dl.acm.org/doi/10.1145/3544549.3573794

https://generativeaiandhci.github.io/

Combating Misinformation in the Era of Generative AI Models https://dl.acm.org/doi/10.1145/3581783.3612704

GenAIR: Exploring Design Factor of Employing Generative AI for Augmented Reality https://dl.acm.org/doi/10.1145/3607822.3618018

This article was brought to life with the assistance of ChatGPT, my AI collaborator, whose insights and generative capabilities helped shape the narrative and enrich the content. A testament to the very subject at hand, this partnership between human thought and artificial intelligence showcases the potential of what we can achieve when we blend the creativity of the human mind with the computational power of AI.

Xinqi Zhang

AR Learning Path and Challenges

Intro:

Augmented Reality (AR) has permeated various sectors, from gaming to healthcare. As its applications grow, the need for skilled professionals in the domain is also surging. If you're aspiring to embark on an AR journey, this blog will guide you through the learning path and shed light on the main technical challenges faced in real-world use cases.

The AR Learning Path:

1. Foundational Blocks: Start with the basics. This includes computer science fundamentals, mathematics (especially linear algebra and 3D geometry), and the rudiments of physics. (A small numpy taste of this geometry follows the list below.)

2. Programming Foundations: Familiarize yourself with programming languages like C# (often used with Unity) or C++. Some AR SDKs also support Python and JavaScript.

3. Computer Graphics: Delve into 3D graphics, shading, and rendering. Tools like Blender or Maya can assist you in creating 3D models.

4. AR SDKs: Get hands-on with Software Development Kits. Unity's AR Foundation is a great cross-platform start, while ARKit and ARCore cater to iOS and Android, respectively.

5. Hardware Acquaintance: Understand the nuances of mobile devices, smart glasses like Microsoft's HoloLens, and the intricacies of cameras & optics.

6. UX/UI Paradigms for AR: AR experiences are unique. Dive into spatial design and 3D interface guidelines.

7. Prototype and Innovate: Start building. Begin with simple apps, and as your prowess grows, add layers of complexity.

8. Stay Abreast: The AR realm is dynamic. Follow leading voices in the industry, participate in forums, and attend conferences.

9. Advanced Territory: If you're looking to push boundaries, venture into computer vision and machine learning to add depth to your AR applications.
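As a taste of the 3D geometry in step 1, here is a self-contained numpy sketch of the kind of transform an AR renderer applies every frame. Real engines hide this behind the SDK, but the math underneath looks like this:

```python
# Rotating a virtual object's anchor point around the vertical (y) axis,
# the kind of 3D transform AR renderers apply every frame. Pure numpy,
# so it runs anywhere; engines like Unity wrap this in their own types.

import numpy as np

def rotation_y(theta_rad: float) -> np.ndarray:
    """3x3 rotation matrix about the y axis."""
    c, s = np.cos(theta_rad), np.sin(theta_rad)
    return np.array([
        [  c, 0.0,   s],
        [0.0, 1.0, 0.0],
        [ -s, 0.0,   c],
    ])

anchor = np.array([0.0, 0.0, 2.0])      # a point 2 m in front of the camera
turned = rotation_y(np.radians(30)) @ anchor
print(turned)  # where the object must be drawn after a 30-degree head turn
```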

Real-World AR: The Technical Challenges:

While the journey in AR is exciting, it's not without its set of challenges. Here are some of the key technical roadblocks encountered in real use cases:

1. Tracking and Registration: Ensuring virtual objects align perfectly with the real world is tricky. Precise tracking mechanisms are required to prevent drifts and misalignments. (A simple smoothing sketch follows this list.)

2. Limited Field of View: Most AR glasses have a restricted field of view, which can break immersion. Expanding this without compromising on form factor is a challenge.

3. Latency: Delays between user actions and AR responses can be disorienting. Ensuring real-time interactions with minimal latency is crucial.

4. Battery Life: AR applications, especially on mobile devices, can be power-hungry, leading to concerns about device longevity.

5. Content Creation: Developing high-quality, realistic 3D content for AR experiences can be resource-intensive and time-consuming.

6. Privacy Concerns: As AR apps often require camera access, there are valid concerns about user privacy and data security.

7. Fragmented Ecosystem: With various platforms and devices, creating standardized AR experiences that work seamlessly across the spectrum is a considerable challenge.
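To ground challenge 1, here is a minimal sketch of one common mitigation for tracking jitter: exponentially smoothing the estimated anchor position. The smoothing factor is a tuning assumption, and production trackers use far more sophisticated filters (Kalman filtering, for example):

```python
# One common mitigation for tracking jitter (challenge 1): exponential
# smoothing of the estimated position. Alpha trades responsiveness
# (high alpha) against stability (low alpha); 0.3 is just an assumption.

import numpy as np

def smooth_poses(raw_positions: np.ndarray, alpha: float = 0.3) -> np.ndarray:
    smoothed = np.empty_like(raw_positions)
    smoothed[0] = raw_positions[0]
    for t in range(1, len(raw_positions)):
        smoothed[t] = alpha * raw_positions[t] + (1 - alpha) * smoothed[t - 1]
    return smoothed

# Noisy x/y/z estimates of a stationary anchor:
rng = np.random.default_rng(0)
raw = np.array([0.0, 0.0, 2.0]) + 0.02 * rng.standard_normal((50, 3))
print(smooth_poses(raw)[-1])  # hovers near (0, 0, 2) with far less jitter
```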

Conclusion:

The AR universe is vast and brimming with possibilities. While the learning path is structured and promising, the real-world application of AR brings its set of technical challenges. However, with technology rapidly advancing and the community's collective efforts, solutions are continually emerging, making AR more refined and immersive. For those willing to navigate through these challenges, the rewards – both in terms of innovation and career opportunities – are immense.

Happy Augmenting! ✨🩷

Xinqi Zhang

Telesurgery Experiences with Augmented Reality

Remote surgery, also known as telesurgery, refers to the practice where surgeons perform procedures on patients located at a different geographic location using robotic systems controlled over a high-speed network. Integrating a sophisticated image recognition system like "Vision Pro" into remote surgery can offer various enhancements to the process. Here's a detailed use case:

Scenario: Advanced Telesurgery System with Vision Pro Integration

Objective: To enhance the accuracy, efficiency, and safety of remote surgeries by utilizing the Vision Pro's image recognition capabilities.

Components:

1. High-definition cameras in the operating room capturing multiple angles.

2. Robotic surgical instruments.

3. High-speed, low-latency communication network.

4. Surgeon's control console.

5. Vision Pro API integration.

Process:

1. Augmented Visualization:

- High-definition cameras stream live footage of the surgical area.

- Vision Pro processes the images in real-time to highlight essential anatomical structures, blood vessels, nerves, and other critical areas.

- The augmented visuals help the surgeon in making precise movements and decisions.

2. Gesture Recognition:

- The surgeon can use gestures to control certain aspects of the robotic instruments (e.g., zooming in/out, rotating the camera).

- Vision Pro recognizes the surgeon's gestures and translates them into commands for the robotic system.

3. Instrument Tracking:

- Cameras monitor the position and movement of all surgical instruments.

- Vision Pro ensures that instruments are being used correctly and alerts if an instrument is approaching a sensitive area or if there's a risk of unintended contact.

4. Assisted Tool Selection:

- Vision Pro can recommend appropriate surgical tools based on the current stage of the surgery or based on the visuals of the surgical area.

- This reduces the time surgeons spend choosing or changing instruments.

5. Documentation and Feedback:

- Vision Pro can automatically document key stages of the surgery, capturing important visuals and annotations.

- Post-surgery, it can provide feedback by analyzing the recorded footage against best practices for training and improvement purposes.

6. Safety Protocols:

- In case of connection issues or lag, Vision Pro can identify the problem and automatically place the robotic system in a 'safe mode', pausing any actions until connectivity is restored.
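To illustrate the safe-mode idea in step 6, here is a hedged Python sketch of a latency watchdog. The 200 ms budget and the pause/resume behavior are assumptions for illustration only, not part of any real surgical stack:

```python
# Illustrative watchdog for step 6: if status heartbeats from the remote
# site stop arriving within a latency budget, the robot is paused. The
# 200 ms budget and the pause/resume hooks are hypothetical placeholders.

import time

LATENCY_BUDGET_S = 0.2  # assumed maximum tolerable heartbeat gap

class SafetyWatchdog:
    def __init__(self) -> None:
        self.last_heartbeat = time.monotonic()
        self.safe_mode = False

    def on_heartbeat(self) -> None:
        self.last_heartbeat = time.monotonic()
        if self.safe_mode:
            self.safe_mode = False
            print("Connectivity restored: resuming under surgeon control.")

    def check(self) -> None:
        if time.monotonic() - self.last_heartbeat > LATENCY_BUDGET_S:
            if not self.safe_mode:
                self.safe_mode = True
                print("Heartbeat lost: pausing robot (safe mode).")

watchdog = SafetyWatchdog()
watchdog.on_heartbeat()
time.sleep(0.25)   # simulate a network stall
watchdog.check()   # -> enters safe mode
```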

Benefits:

- Enhanced Precision: Real-time highlighting and annotations improve the surgeon's accuracy.

- Safety: Potential risks are identified immediately, reducing the chances of surgical errors.

- Efficiency: Gesture controls, assisted tool selection, and augmented visuals speed up the procedure.

- Training: Post-surgery analysis offers valuable insights for training new surgeons.

Potential Challenges:

- Reliance on Technology: Over-reliance on Vision Pro might make surgeons less vigilant, potentially leading to oversights if the system misses something.

- Network Dependency: The entire system, including Vision Pro's functionalities, relies heavily on a stable and fast network connection.

- Privacy and Data Security: Ensuring patient data remains confidential and secure is paramount.

When integrated into telesurgery, Vision Pro could be a game-changer, offering unparalleled assistance to surgeons and ensuring safer and more effective surgical outcomes. However, its introduction should be carefully managed, ensuring that surgeons remain the primary decision-makers and that technology serves as an aid, not a replacement.

Related papers to investigate:

Augmented Reality in Medical Practice: From Spine Surgery to Remote Assistance. Retrieved from [https://pubmed.ncbi.nlm.nih.gov/33859995/]

Remote Interactive Surgery Platform (RISP): Proof of Concept for an Augmented-Reality-Based Platform for Surgical Telementoring. Retrieved from [https://pubmed.ncbi.nlm.nih.gov/36976107/]

Applications of Mixed Reality Technology in Orthopedics Surgery: A Pilot Study. Retrieved from [https://pubmed.ncbi.nlm.nih.gov/35273954/]

Gesture Recognition in Robotic Surgery: A Review. Retrieved from [https://pubmed.ncbi.nlm.nih.gov/33497324/]

Xinqi Zhang

Augmented Reality (AR) in Healthcare: Exploring the Possibilities

Augmented Reality (AR) offers a blend of the physical and digital worlds, presenting virtual information overlaid onto the real environment. Its application in healthcare can be transformative, enhancing the quality of care, improving medical training, and aiding in various therapeutic interventions. Here are some potential use cases and challenges:

Use Cases for Augmented Reality in Healthcare:

1. Medical Training and Education:

- Surgical Simulation: Allows medical students and surgeons to practice procedures in a simulated environment before performing them on real patients.

- Anatomy Visualization: AR can help students visualize complex anatomical structures, improving understanding and retention.

2. Surgical Assistance:

- Preoperative Planning: Surgeons can visualize the surgical area in 3D, aiding in planning the procedure.

- Intraoperative Navigation: During surgery, AR can provide real-time data overlays, such as the location of blood vessels, nerves, or tumors, helping surgeons avoid critical structures.

3. Physical Rehabilitation:

- AR games and exercises can make physical therapy more engaging. Patients can see visual feedback on their movements, ensuring they perform exercises correctly and aiding in recovery.

4. Visualization of Medical Imaging Data:

- Radiologists and doctors can view and interact with 3D reconstructions of MRIs, CT scans, or X-rays in real-time, aiding in diagnosis and treatment planning.

5. Remote Consultations:

- Using AR glasses, a doctor can overlay and share diagnostic information, annotations, or treatment recommendations during telemedicine sessions.

6. Patient Education:

- Using AR apps, patients can better understand their conditions, visualize their treatments, or learn about medication regimens, improving adherence and outcomes.

7. Cognitive Rehabilitation:

- For patients with cognitive disorders, AR can offer therapeutic games or tasks that train memory, attention, or problem-solving skills.

8. Procedural Assistance for Nurses and Technicians:

- Guided visualization for tasks like inserting a catheter, drawing blood, or placing an IV, ensuring accuracy and reducing errors.

9. Assistive Technology for the Visually Impaired:

- AR glasses can recognize and announce text, objects, or obstacles, assisting visually impaired individuals in navigation and daily tasks.
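As a taste of how simple the core pipeline in use case 9 can be, here is a sketch chaining off-the-shelf OCR and text-to-speech. It assumes the pytesseract and pyttsx3 packages plus a local Tesseract install; a real assistive device would run this continuously on the glasses' camera feed:

```python
# Sketch of use case 9: read text seen by an AR camera aloud. Assumes
# `pip install pytesseract pyttsx3 pillow` plus a local Tesseract OCR
# install; the image path stands in for a live camera frame.

from PIL import Image
import pytesseract
import pyttsx3

def announce_text(image_path: str) -> str:
    text = pytesseract.image_to_string(Image.open(image_path)).strip()
    if text:
        engine = pyttsx3.init()
        engine.say(text)
        engine.runAndWait()  # blocks until speech finishes
    return text

# print(announce_text("street_sign.jpg"))  # hypothetical camera frame
```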

Challenges of Implementing Augmented Reality in Healthcare:

1. Accuracy and Precision: AR must provide accurate overlays, especially in surgical applications. Inaccuracies can result in medical errors.

2. Integration with Medical Devices: Integrating AR with existing medical devices or systems, especially older ones, can be complex and expensive.

3. Cost: Developing and implementing AR solutions, especially in resource-limited settings, can be costly.

4. User Experience: Ensuring that AR interfaces are intuitive and user-friendly is essential. Poor user experience can hinder adoption.

5. Regulations and Approval: Any AR application intended for medical use needs to meet regulatory standards and obtain necessary approvals, which can be time-consuming.

6. Privacy and Security: AR systems that store or transmit patient data must adhere to privacy laws and ensure data security.

7. Reliability: AR systems must be reliable, especially in critical scenarios like surgeries. System crashes or malfunctions can have serious consequences.

8. Training and Adaptation: Healthcare professionals need training to use AR tools effectively, which requires time and resources.

9. Physical and Psychological Effects: Prolonged use of AR can cause eyestrain, fatigue, or even psychological impacts if used in therapeutic contexts.

10. Technical Limitations: Current AR technology may still have limitations in terms of field of view, battery life, or real-time processing speed.

Despite these challenges, the potential of AR in healthcare is vast. As technology continues to advance and as solutions to these challenges are found, it's likely that the role of AR in healthcare will grow exponentially.

Related papers (search links):

1. PubMed: https://pubmed.ncbi.nlm.nih.gov/?term=Remote+Surgery+Augmented+Reality

2. IEEE Xplore: https://ieeexplore.ieee.org/search/searchresult.jsp?newsearch=true&queryText=Remote%20Surgery%20Augmented%20Reality

Xinqi Zhang

Why be a researcher?

Redwood view near Santa Cruz

"In this vast world, there's an abundance of wonders waiting to be explored. I aspire to contribute positively and add value to our global community."
