Mastering Console and Troubleshooting: My Big Tech Interview Experience
How I navigated command-line challenges and system diagnostics during a hands-on interview at a leading cloud provider
Today, I’d like to share my experience from the second interview at one of the biggest tech companies. This interview was focused on “Console and Troubleshooting”—a section dedicated to testing my command-line proficiency, understanding of operating systems, and troubleshooting skills. Interestingly, it didn't involve any coding, so I didn't have to write a single line of Python.
My interviewer was a seasoned veteran with 12 years at this company, and from the very start, the conversation flowed smoothly. Just like during the first round, we connected over a shared "IT language." We kicked things off by SSH-ing into a Linux machine, which led me to the first task.
Parsing Nginx Logs
The first task was familiar: parsing Nginx logs. It is a standard exercise: I had to count the successful requests (status code 200) and the unsuccessful ones. This was already the third time I'd been asked to do something like this in the past six months.
For this task, I used grep with a regular expression to count the successful requests:
grep ' HTTP/[0-9.]*" 200 ' access.log | wc -l
To count the unsuccessful requests, I could have used an inverted grep (grep -v), but I also suggested simply subtracting the successful requests from the total number of lines. We both smiled and moved on to the next task.
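The subtraction idea can also be done in a single pass over the file. Here is a minimal sketch, assuming the default Nginx "combined" log format, where the status code is the ninth whitespace-separated field. The function name count_status and the sample log lines are made up for illustration.

```shell
#!/bin/sh
# Count 200 vs. non-200 responses in one pass.
# Assumption: default "combined" log format, status code in field 9.
count_status() {
    awk '$9 == 200 { ok++ } END { printf "ok=%d other=%d\n", ok, NR - ok }' "$1"
}

# Made-up sample data so the sketch can run on its own.
sample_log=$(mktemp)
cat > "$sample_log" <<'EOF'
203.0.113.5 - - [10/Oct/2024:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 612 "-" "curl/8.0"
203.0.113.5 - - [10/Oct/2024:13:55:37 +0000] "GET /missing HTTP/1.1" 404 153 "-" "curl/8.0"
203.0.113.9 - - [10/Oct/2024:13:55:38 +0000] "POST /login HTTP/1.1" 500 0 "-" "curl/8.0"
EOF

count_status "$sample_log"   # prints: ok=1 other=2
rm -f "$sample_log"
```

Matching on a field rather than a regular expression avoids false hits when "200" shows up elsewhere in a line, for example as a response size.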
Next, I needed to find the most frequently accessed URLs in the log. For that, my favorite tool is awk. Here's the command I used to extract and count the URLs, sort them, and display the top three:
awk '{print $7}' access.log | sort | uniq -c | sort -nr | head -n 3
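The same result can be had by letting awk do the counting itself. This is a sketch of that variation, not the command from the interview. It assumes the default "combined" log format with the request path in field 7; top_urls and the sample data are hypothetical.

```shell
#!/bin/sh
# Print the N most requested paths (default 3), most frequent first.
# Assumption: "combined" log format, request path in field 7.
top_urls() {
    awk '{ hits[$7]++ } END { for (url in hits) print hits[url], url }' "$1" |
        sort -rn | head -n "${2:-3}"
}

# Made-up sample data: /index.html three times, /about twice, /contact once.
sample_log=$(mktemp)
for path in /index.html /about /index.html /contact /about /index.html; do
    printf '203.0.113.5 - - [10/Oct/2024:13:55:36 +0000] "GET %s HTTP/1.1" 200 612\n' "$path"
done > "$sample_log"

top_urls "$sample_log" 2
rm -f "$sample_log"
```

Counting inside awk skips the first sort, which can matter on large logs because only the distinct URLs get sorted.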
Determining the System Type
The next challenge was more interesting: determining whether the system was a virtual machine, a physical machine, or a container. I started by checking the /proc/1/cgroup file, which sometimes contains containerization details, but the file was empty. So, I turned to hostnamectl, and it confirmed that the machine was a VM. Here’s an example of the output:
Static hostname: ixx
Icon name: computer-vm
Chassis: vm
Virtualization: xen
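On systemd machines, the same question can be asked more directly with systemd-detect-virt. The following is a best-effort sketch rather than what I ran in the interview. The function name detect_system_type is my own, and the /.dockerenv fallback only covers Docker.

```shell
#!/bin/sh
# Best-effort guess: prints "container", "vm", "physical", or "unknown".
detect_system_type() {
    if command -v systemd-detect-virt >/dev/null 2>&1; then
        # --quiet suppresses output; the exit status is 0 when a match is found.
        if systemd-detect-virt --quiet --container; then
            echo container
        elif systemd-detect-virt --quiet --vm; then
            echo vm
        else
            # Nothing detected, so bare metal is the most likely answer.
            echo physical
        fi
    elif [ -f /.dockerenv ]; then
        # Marker file created by Docker; other runtimes may not have it.
        echo container
    else
        echo unknown
    fi
}

detect_system_type
```

Checking for a container before a VM matters because a container can itself run inside a virtual machine.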
We then explored the machine’s processors using lscpu.
Performance Diagnostics
Next, we discussed system performance, covering tools like top, atop, free, and vmstat, what load averages mean, and how to list processes with ps -ef. I wasn’t able to give a detailed answer about process states such as zombie, interruptible sleep, and uninterruptible sleep, but I hope to master this concept soon.
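For my own revision, here is a small sketch of how the process states show up in practice. The state letters are the ones documented in the ps man page (procps). The helper names state_name and count_states are hypothetical, and the sample input is made up.

```shell
#!/bin/sh
# Map the first letter of the STAT column from ps to a readable name.
state_name() {
    case "$1" in
        R*) echo "running or runnable" ;;
        S*) echo "interruptible sleep" ;;
        D*) echo "uninterruptible sleep" ;;
        Z*) echo "zombie" ;;
        T*) echo "stopped" ;;
        I*) echo "idle kernel thread" ;;
        *)  echo "other" ;;
    esac
}

# Count processes per state; reads one STAT value per line on stdin.
count_states() {
    cut -c1 | sort | uniq -c | sort -rn
}

# Made-up STAT values; on a live system use: ps -eo stat= | count_states
printf '%s\n' Ss Ss R+ D Z | count_states
state_name D
```

A zombie has already exited and is only waiting for its parent to collect the exit status, while a process in uninterruptible sleep is typically blocked on I/O and cannot be interrupted by signals until that I/O completes.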
I encountered a high load average even though only one process was using about half of a CPU core. (Note: top reports each process's CPU usage relative to a single core, not the entire system.) The question was: "What’s consuming so much CPU?" I suspected intense I/O, because on Linux the load average also counts processes waiting in uninterruptible sleep, so we dove into iostat, a tool for monitoring I/O statistics and CPU usage on Unix-like systems. iostat revealed a high %iowait, which means the CPU was idle much of the time while waiting for I/O operations to complete.
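To make the load-versus-CPU reasoning concrete, here is a sketch that normalizes the load average by core count and lists tasks stuck in uninterruptible sleep. It assumes a Linux system with /proc/loadavg; load_per_core and blocked_tasks are hypothetical helper names.

```shell
#!/bin/sh
# Divide a load average by the number of cores, to two decimals.
load_per_core() {
    awk -v load="$1" -v cores="$2" 'BEGIN { printf "%.2f\n", load / cores }'
}

# Print "PID COMMAND" for tasks in state D.
# Expects the output of "ps -eo state,pid,comm" on stdin.
blocked_tasks() {
    awk '$1 ~ /^D/ { print $2, $3 }'
}

load_per_core 4.00 2   # prints: 2.00

# Live check, only where /proc/loadavg exists (Linux).
if [ -r /proc/loadavg ]; then
    read -r one_min _ < /proc/loadavg
    cores=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
    load_per_core "$one_min" "$cores"
fi
```

On a live machine, `ps -eo state,pid,comm | blocked_tasks` shows which processes are waiting on I/O. A per-core load well above 1 combined with low CPU usage usually points in that direction.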
The Finale: Investigating an Unknown Service
The final task involved investigating an unknown service (let's call it mi-service) that was performing read/write operations on disk. I used systemctl to check the service status, but that didn’t yield any results. So, I moved on to inspecting the logs in /var/log/mi-service. Sure enough, I found the necessary information there. I also needed to determine which ports the service was using, so I ran netstat -tuln to list the listening TCP and UDP sockets.
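Adding -p to netstat (usually as root) appends a "PID/Program name" column, which makes it possible to filter the listening sockets by service. Below is a sketch of that filter. ports_for_program is a hypothetical helper, and the sample lines are made up to resemble netstat -tulnp output.

```shell
#!/bin/sh
# Print "PROTO PORT" for sockets owned by the given program.
# Expects "netstat -tulnp" output on stdin; the last column is PID/Program name.
ports_for_program() {
    awk -v name="$1" '{
        if (split($NF, prog, "/") == 2 && prog[2] == name) {
            n = split($4, addr, ":")   # the port is the last ":"-separated part
            print $1, addr[n]
        }
    }'
}

# Made-up sample resembling netstat -tulnp output.
ports_for_program mi-service <<'EOF'
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 0.0.0.0:8080            0.0.0.0:*               LISTEN      1234/mi-service
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      801/sshd
udp        0      0 0.0.0.0:9125            0.0.0.0:*                           1234/mi-service
EOF
```

On a live system the call would be `netstat -tulnp | ports_for_program mi-service`. Splitting the local address on its last colon keeps the filter working for IPv6 entries such as :::8080.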
There were a few more questions and discussions, but I don’t recall the details (or they’re not worth mentioning). Overall, I really enjoyed how this company handles interviews, and I look forward to the next round—if they offer it, of course!