
5. Choosing Your Strategy: Element vs. Image

A strategic guide to building resilient workflows when software elements are hard to capture

Written by Sophie
Updated yesterday

The Hierarchy of Reliability

In any automation project, your first and most important decision is how you choose to interact with the screen. When you encounter software that resists standard capture, it is essential to follow a clear Priority Hierarchy rather than picking methods at random:

  • First Priority: Element Identification. This is your "Gold Standard." It is the most stable and precise way to automate because it speaks the software's native language.

  • Second Priority: Fallback Strategies. Use Keyboard, Mouse, or Image automation only when the primary method is unavailable. These act as your "human-like" backup plan, bridging the gaps where the internal structure of an application cannot be read.
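The hierarchy above can be sketched in code as an ordered list of strategies that are tried one after another. This is a minimal illustration in plain Python, not the product's own API: `run_with_fallback` and the strategy names are hypothetical, and each strategy is assumed to be a callable that returns True when it succeeds.

```python
from typing import Callable, Optional, Sequence, Tuple

# A strategy is a (name, action) pair; the action returns True on success.
Strategy = Tuple[str, Callable[[], bool]]


def run_with_fallback(strategies: Sequence[Strategy]) -> Optional[str]:
    """Try each strategy in priority order and return the name of the first
    one that succeeds, or None if every strategy fails."""
    for name, action in strategies:
        try:
            if action():
                return name
        except Exception:
            # A strategy that raises is treated as unavailable,
            # so the next one in the hierarchy gets its turn.
            continue
    return None


# Hypothetical usage: element identification first, fallbacks after it.
used = run_with_fallback([
    ("element", lambda: False),  # pretend the element could not be captured
    ("image", lambda: True),     # the image fallback succeeds
])
print(used)
```

Keeping the order in one list makes the priority explicit: a fallback only runs when everything above it has failed.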

Identity vs. Appearance: The Core Distinction

To make the right choice, you need to understand the fundamental difference between these two technologies:

Element Identification is like finding a person by their official ID card. It doesn't matter what they are wearing or how the lighting changes; as long as their ID (the underlying code and attributes) remains the same, RPA will find them reliably. This method is resilient to visual changes like color or font shifts, though it can still break if the application's underlying attributes change.

Image Recognition, on the other hand, is like recognizing someone by their appearance. It scans the screen for pixel patterns. While powerful, this method is sensitive—if the "outfit" of your software changes (due to shifts in brightness or color), the recognition might fail.

Pro Tip: To make image matching more robust, you can use the Grayscale Matching feature to ignore color shifts or adjust the Similarity Threshold to handle minor visual changes.
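To see why these two settings help, here is a toy sketch of the idea in plain Python. It is not how the product's matcher is implemented: the functions `to_gray`, `similarity`, and `is_match`, the scoring formula, and the default threshold of 0.9 are all assumptions made for illustration. Images are represented as flat lists of RGB pixels.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (red, green, blue), each 0-255


def to_gray(pixel: Pixel) -> float:
    """Convert an RGB pixel to a brightness value (ITU-R BT.601 weights)."""
    r, g, b = pixel
    return 0.299 * r + 0.587 * g + 0.114 * b


def similarity(template: List[Pixel], region: List[Pixel],
               grayscale: bool = False) -> float:
    """Return a score from 0.0 to 1.0; 1.0 means the pixels are identical."""
    if not template or len(template) != len(region):
        raise ValueError("template and region must be non-empty and equal in size")
    if grayscale:
        # Compare brightness only, so a pure color change is ignored.
        diffs = [abs(to_gray(a) - to_gray(b)) for a, b in zip(template, region)]
    else:
        # Compare every color channel and average the difference per pixel.
        diffs = [sum(abs(x - y) for x, y in zip(a, b)) / 3
                 for a, b in zip(template, region)]
    return 1.0 - (sum(diffs) / len(diffs)) / 255.0


def is_match(template: List[Pixel], region: List[Pixel],
             threshold: float = 0.9, grayscale: bool = False) -> bool:
    """A region matches when its similarity reaches the threshold."""
    return similarity(template, region, grayscale) >= threshold


button = [(200, 0, 0)] * 4      # a red button captured as the template
recolored = [(0, 102, 0)] * 4   # same brightness, but now green
dimmed = [(190, 0, 0)] * 4      # slightly darker red

print(is_match(button, recolored))                  # color comparison
print(is_match(button, recolored, grayscale=True))  # brightness comparison
print(is_match(button, dimmed, threshold=0.95))     # relaxed threshold
```

In this sketch the recolored button only matches when grayscale comparison is on, and the dimmed button only matches once the threshold is relaxed enough. Lowering the threshold too far has the opposite risk: unrelated regions start to match.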

When Each Method Shines

Different scenarios require different tools. Use these patterns to decide which command to drag into your workflow:

  • Keyboard Automation: Best for extensive text input or rapid navigation using shortcuts like Ctrl+C and Ctrl+V. It is often the fastest way to drive a data-entry workflow.

  • Mouse Automation: Ideal for fixed-position interfaces. If a button's location is consistent and the layout doesn't shift, clicking at precise coordinates is a simple and dependable solution.

  • Image Automation: Indispensable for "Black Box" environments like legacy software, custom graphical apps, or games. If you can see it on the screen but can't find its code, recognize it visually and act on it.
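The patterns above can be condensed into a small decision helper. This is only a sketch of the reasoning: `choose_method` and its three yes/no inputs are hypothetical, and the order in which the fallbacks are checked is an assumption rather than a rule from the product.

```python
def choose_method(element_capturable: bool,
                  text_heavy: bool,
                  fixed_layout: bool) -> str:
    """Suggest an interaction method for one workflow step.

    element_capturable: the target's underlying element can be captured
    text_heavy: the step is mostly typing or keyboard shortcuts
    fixed_layout: the target always appears at the same screen position
    """
    if element_capturable:
        return "element"   # first priority whenever it is available
    if text_heavy:
        return "keyboard"  # text entry and shortcut navigation
    if fixed_layout:
        return "mouse"     # click at known, stable coordinates
    return "image"         # visible on screen but otherwise unreadable


# A legacy app whose buttons move around and cannot be captured:
print(choose_method(element_capturable=False, text_heavy=False, fixed_layout=False))
```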

The Hybrid Strategy: Staying Pragmatic

In the real world, the most stable automation often mixes all three techniques in a single workflow. Don’t be restricted by a single method; instead, layer your techniques for maximum resilience:

  1. Input with Keyboard: Use Send Keys commands for rapid text entry and field switching.

  2. Navigate with Mouse: Use a Click Mouse action to hit stable buttons or menu items.

  3. Verify with Image: Use Image Recognition features to confirm a task is complete (e.g., waiting for a specific "Success" banner to appear).
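The three steps above might be combined as follows. This is a sketch under stated assumptions: `send_keys`, `click`, and `banner_visible` are hypothetical stand-ins for whatever keyboard, mouse, and image commands your tool provides, passed in as plain functions so the control flow is easy to follow. Only the polling helper `wait_until` contains real logic.

```python
import time
from typing import Callable, Tuple


def wait_until(condition: Callable[[], bool],
               timeout: float = 10.0, interval: float = 0.5) -> bool:
    """Poll condition() until it returns True or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


def submit_form(send_keys: Callable[[str], None],
                click: Callable[[int, int], None],
                banner_visible: Callable[[], bool],
                text: str,
                button_xy: Tuple[int, int],
                timeout: float = 10.0,
                interval: float = 0.5) -> bool:
    """Type with the keyboard, click with the mouse, verify with an image check."""
    send_keys(text)        # 1. input with keyboard
    click(*button_xy)      # 2. navigate with mouse
    # 3. verify with image: wait for the "Success" banner to appear
    return wait_until(banner_visible, timeout=timeout, interval=interval)


# Hypothetical usage with fake actions that only record what happened.
log = []
polls = iter([False, False, True])  # the banner shows up on the third check
ok = submit_form(
    send_keys=lambda t: log.append(("keys", t)),
    click=lambda x, y: log.append(("click", x, y)),
    banner_visible=lambda: next(polls),
    text="hello",
    button_xy=(10, 20),
    timeout=1.0,
    interval=0.0,
)
print(ok, log)
```

Ending with a verification step means the workflow reports failure when the banner never appears, instead of silently continuing after a missed click.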

Summary: Building for Resilience

Mastering these distinctions allows you to choose the smartest technique for any project. By prioritizing Element Identification and strategically layering Keyboard, Mouse, and Image actions, you create workflows that are not only efficient but also resilient enough to handle real-world software challenges. This strategic mindset is what separates a basic script from a professional automation solution.
