Software Triage Engineer at Zeno

Zeno
Full-time
On-site
Key Responsibilities

Technical Triage & Investigation


Issue Identification: Monitor real-time logs and alerts from swap stations and fast chargers to identify emerging software failures.
Root Cause Analysis: Use Python and SQL to query databases and parse logs to determine whether an issue is hardware-based, software-based, or network-related.
Asyncio Debugging: Analyze and debug asynchronous Python code used in our high-concurrency charging and swapping services.
System Health: Proactively monitor cloud-to-device connectivity and API health.


Liaison & Communication (Single Point of Contact, SPOC)


Field Support: Act as the primary technical interface for local field technicians, providing remote guidance for software resets, configuration changes, and on-site troubleshooting.
Dev Team Liaison: Document complex bugs with reproducible steps, logs, and technical context to hand over to the core development team for permanent fixes.
Status Reporting: Provide daily summaries of open incidents, resolution times, and recurring patterns to stakeholders.


Immediate Resolution & Maintenance


Scripting: Develop and maintain internal Python scripts to automate common triage tasks or data extraction.
Patching: Deploy hotfixes or configuration updates to field devices under the guidance of the development team.
Database Management: Run complex SQL queries to correct data inconsistencies caused by intermittent connectivity or edge-case software bugs.


Required Skills & Qualifications

Technical Requirements


Python Proficiency: Strong experience with Python 3, with a deep understanding of asyncio and concurrent programming.
Database Skills: Advanced SQL skills for data analysis, complex querying, and troubleshooting.
Linux/IoT Knowledge: Comfort working in a Linux environment and interacting with IoT devices via SSH or remote management tools.
API Knowledge: Understanding of RESTful APIs and WebSocket communication.


Professional Experience


2+ years of experience in Software Engineering, Site Reliability Engineering (SRE), or a high-level Technical Support role.
Prior experience in the EV industry, IoT, or telecommunications is a significant plus.
Proven track record of managing live production incidents in a high-pressure environment.