Running Maestro mobile E2E tests in CI

  • Status: proposed

  • Deciders: app team, devops, managers

  • Date: 2026-01-30

Context and Problem Statement

We run end-to-end (E2E) tests for the Ísland.is mobile app using Maestro flows. We need a CI setup that is reliable on both iOS and Android, provides actionable debugging information when failures occur, and requires minimal maintenance.

We evaluated three approaches:

  • Run Maestro tests inside our existing Codemagic CI using local simulators/emulators.

  • Use a dedicated device-testing SaaS: Maestro Cloud.

  • Use a dedicated device-testing SaaS: devicecloud.dev.

After some effort, we got the Maestro E2E tests passing on iOS in Codemagic, but not on Android. Android runs are flaky and dominated by timeouts, likely caused by limited or contended CI resources. Debugging failures in Codemagic is also slow because it is difficult to consistently extract high-quality artifacts (e.g., recordings and structured run reports).

Both SaaS device-cloud solutions “just work” operationally and provide UIs, recordings, and retry capabilities. However, Maestro Cloud is comparatively expensive, while devicecloud.dev appears to offer a better cost-to-value ratio, despite being a smaller operation.

Decision Drivers

  • Reliability across platforms: Reduce Android flakiness and timeout failures seen on Codemagic-hosted emulators.

  • Debuggability: Provide fast access to artifacts (recordings, logs/reports) and a UI for triage.

  • Operational simplicity: Avoid maintaining simulator/emulator infrastructure and artifact plumbing in CI.

  • Cost efficiency: Keep recurring costs proportional to usage.

Considered Options

  • Option 1: Codemagic-only (run Maestro locally on Codemagic machines with emulators/simulators)

  • Option 2: Maestro Cloud

  • Option 3: devicecloud.dev

Decision Outcome

Chosen option: Option 3 — devicecloud.dev.

We will continue to build app binaries in Codemagic, but we will execute Maestro flows on devicecloud.dev for both iOS and Android.

Rationale:

  • It natively supports running Maestro flows.

  • It addresses the current core pain: Android flakiness from CI resource constraints.

  • It significantly improves debugging via hosted UI, recordings, and artifact access.

  • It is more cost-effective than Maestro Cloud while still providing a “device cloud” experience.

Positive Consequences

  • Improved stability for Android E2E compared to Codemagic-hosted emulator execution.

  • Better failure triage: recordings and reporting are readily accessible and consistent.

  • Lower maintenance: less time spent tuning CI machines, emulator settings, and artifact extraction.

  • Cost alignment: better value than Maestro Cloud for our current scale.

Negative Consequences

  • SSO may require enterprise-style purchasing:

    • Maestro Cloud lists SSO under its Enterprise tier.

    • devicecloud.dev offers enterprise SSO to organizations that have purchased $2000 or more in DeviceCloud credits.

    • Given the low security risk and the small number of active users (2–3), we accept non-SSO access for now and will revisit if adoption grows or policy requires it.

  • External service dependency: test execution relies on a third-party platform.

    • This is considered minor because Maestro flows are portable and we can switch vendors with relatively low effort.

Pros and Cons of the Options

Option 1: Codemagic-only (local simulator/emulator execution)

Good, because:

  • Keeps execution within existing CI without introducing a separate execution vendor.

Bad, because:

  • Observed Android flakiness/timeouts due to resource constraints.

  • Harder to collect consistent, high-quality debugging artifacts compared to device-cloud solutions.

  • Higher ongoing maintenance for stable emulator/simulator execution and artifact handling.

Option 2: Maestro Cloud

Good, because:

  • Purpose-built hosted execution for Maestro with CI integration and cloud reporting.

Bad, because:

  • Cost: priced per device per month and comparatively expensive for our needs. At the time of writing it costs $250 per device per month, so $500 per month in total for one iOS and one Android device, without parallelisation.

  • SSO: listed as an Enterprise feature, which may be unnecessary at our current user count.

Option 3: devicecloud.dev (chosen)

Good, because:

  • Purpose-built hosted execution with a UI and artifact support designed for Maestro workflows.

  • Cost-effective: at the time of writing it costs around $0.10 per flow, or $2-3 to run the full suite on iOS and Android. That is under $100 per month even if we run the full suite every night (31 runs at $3 is $93).

  • Includes 5x parallelisation on both platforms (the effective parallelism depends on overall load).
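
The cost comparison behind these figures can be checked with a few lines of arithmetic. The sketch below uses only the prices quoted in this document ($250 per device per month for Maestro Cloud, $2-3 per full-suite run on devicecloud.dev). The function names and the 31-night month are illustrative assumptions, not part of either provider's tooling.

```python
# Rough monthly cost comparison using the prices quoted in this ADR.
# Function names and the 31-night month are assumptions for illustration.


def maestro_cloud_monthly(devices: int, price_per_device: float = 250.0) -> float:
    """Flat per-device subscription, independent of how often we run."""
    return devices * price_per_device


def devicecloud_monthly(runs: int, cost_per_suite_run: float) -> float:
    """Usage-based: each full-suite run (iOS + Android) is billed separately."""
    return runs * cost_per_suite_run


nights = 31  # worst case: a run every night of a long month
low = devicecloud_monthly(nights, 2.0)
high = devicecloud_monthly(nights, 3.0)
flat = maestro_cloud_monthly(devices=2)  # one iOS device + one Android device

print(f"devicecloud.dev: ${low:.0f}-${high:.0f} per month")
print(f"Maestro Cloud:   ${flat:.0f} per month")
```

Because nightly runs are skipped when nothing has changed, the usage-based figure is an upper bound rather than an expected bill.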

Bad, because:

  • SSO gating: enterprise SSO is available but requires meeting a minimum credit purchase threshold.

Trigger Strategy (CI scheduling)

We will run the full Maestro suite nightly, but only when there are app changes since the last successful nightly run (i.e., changes affecting the mobile app code or test flows).

Nightly runs provide a predictable cadence and reduce CI load/noise while still catching regressions quickly enough for the current workflow.
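
The "only when there are app changes" rule can be sketched as a small path filter. This is a minimal illustration, not the actual CI configuration: the watched path prefixes are hypothetical and would need to match the real repository layout, and how the last successful nightly commit is recorded is left to the CI setup.

```python
# Decide whether tonight's run is needed, given the files changed since the
# last successful nightly run. The path prefixes below are hypothetical.
from typing import Iterable

WATCHED_PREFIXES = ("apps/mobile/", "e2e/flows/")  # assumption: adjust to the repo


def should_run_nightly(
    changed_paths: Iterable[str],
    watched_prefixes: tuple[str, ...] = WATCHED_PREFIXES,
) -> bool:
    """Return True if any changed file lives under a watched prefix."""
    return any(path.startswith(watched_prefixes) for path in changed_paths)


# Example: a docs-only change does not trigger a run, an app change does.
print(should_run_nightly(["docs/README.md"]))
print(should_run_nightly(["docs/README.md", "apps/mobile/src/App.tsx"]))
```

In a CI job, the list of changed paths could come from `git diff --name-only <last-green-sha> HEAD`, where `<last-green-sha>` stands for the commit of the last successful nightly run.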

Data / Security Considerations

  • E2E tests must run using mock/non-sensitive test data only (no production identities, no real user data).

  • Test artifacts (recordings, screenshots, logs) should therefore contain no sensitive information.

  • Because the Ísland.is app is open source and the E2E suite is designed around mock data, sending binaries and flows to a device-cloud provider is considered a low security risk, provided we enforce the “no sensitive data” rule and store provider API keys as CI secrets.
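
As a hedged illustration of the "API keys as CI secrets" rule, a CI helper script could read the key from the environment and fail fast when it is missing, so the key never needs to appear in the repository. The variable name `DEVICE_CLOUD_API_KEY` is a hypothetical choice for this sketch, not a name required by the provider.

```python
# Read the provider API key from the environment instead of the repository.
# DEVICE_CLOUD_API_KEY is a hypothetical variable name chosen for this sketch.
import os
from typing import Mapping

API_KEY_VAR = "DEVICE_CLOUD_API_KEY"


def load_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the API key from the environment, or fail with a clear error."""
    key = env.get(API_KEY_VAR, "").strip()
    if not key:
        raise RuntimeError(f"{API_KEY_VAR} is not set; configure it as a CI secret")
    return key
```

Failing before any upload starts keeps a misconfigured secret from surfacing later as an opaque authentication error in the test run.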
