After conducting hundreds of developer interviews over the last ten years, I have realized that the available literature on programming, data structures, algorithms, system design and computer science in general has improved massively. Most importantly, the content is easily accessible for free. The primary source of knowledge for learning these concepts has shifted from books and research papers to blogs, YouTube videos and tutorials. But this comes with a downside.
If you are a young developer working in a startup, or even an engineering leader, you will feel overwhelmed when you try to hunt down what you should learn, a few concepts/publications/channels at a time. Not doing that will probably mean depriving yourself of the base knowledge required for building complex applications and platforms.
I want to address a few unsolved problems regarding available literature for developers and engineering leaders in this blog series. That these problems are unsolved is my assumption. If you feel there is some corner of the Internet that solves one (or more) of these problems better, please mention it in the comments. You can also DM me on Twitter, LinkedIn or Instagram.
Problem 1 - There needs to be an index page for organizing the content in order
The lack of an index page is why many people sign up for expensive paid courses: the content available for free on the Internet has no structure. You don't know where to start and where to go next. Consider the chapters and their content in this blog as a list of keywords you are supposed to search for or the questions you need to ask ChatGPT (or the generative AI bot of your choice). I strongly recommend that you follow the order of the chapters. But, if you are familiar or relatively comfortable with one or more topics, feel free to skip them. Problem 1 is covered in this blog, which is part one of the series.
Problem 2 - Many of these resources tend to cover the topic from a specific lens
For example, most tutorials only talk about ACID properties in the context of SQL databases. As an engineer, it's critical that if you are learning a topic, you are exploring it from all dimensions and not from a specific lens. Research that topic like you research a problem at work or how you research the next gadget you want to buy (insert your favourite research item here). We will cover this problem in part two of this blog series.
Problem 3 - A practical approach towards learning something new in tech is needed
If you start covering computer science and technology topics with a depth-first or even a breadth-first approach, you will probably end up in an endless spiral of knowledge. You need to be more pragmatic about this. I will cover this topic in part three of this blog series.
An Index of topics you should learn and understand as a developer
I have compiled a list of concepts you should know (in order). I have added notes wherever necessary. The underlying assumption is that you are already a developer and know how to do basic programming. If you are not, start with that and revisit this list.
Chapter 1 - List of basic system design concepts/topics/questions that you should learn
- What is system design? - learn what system design is, why it is valuable, and its purpose while building complex technology products. Also, learn the steps involved in system design and the difference between low-level and high-level design.
- Back-of-the-envelope calculation in system design - If mastered, this topic will save you thousands of dollars and hours. Although it's just one of the steps in system design, it deserves special mention. Practice it as much as possible.
- Vertical and horizontal scaling - learn the difference between horizontal and vertical scaling at a conceptual level.
- What is load balancing?
- CAP theorem - Don't move ahead without understanding the CAP theorem well.
- What is the meaning of throughput and latency?
- Difference between concurrency and parallelism
- How do HTTP requests (and protocol) work?
- How do WebSocket requests (and protocol) work?
- How do P2P networks work?
- How does Garbage collection work in various languages? - It's important to note that garbage collector implementation details can vary within each language and across different versions or implementations of the language runtime. So, try and understand why the garbage collection was programmed in that manner and why it changed with a different version of the language.
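To give a feel for the back-of-the-envelope calculation mentioned in the list above, here is a minimal sketch in Python. The function name, its parameters and the traffic numbers are all hypothetical and chosen only for illustration. The point is the habit of turning a few rough assumptions into requests per second and storage per day before you design anything.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def estimate_capacity(daily_active_users, requests_per_user_per_day,
                      avg_payload_bytes, peak_factor=2.0):
    """Turn rough traffic assumptions into order-of-magnitude numbers.

    Every input is an assumption you make up front; the output is only
    as good as those guesses.
    """
    requests_per_day = daily_active_users * requests_per_user_per_day
    avg_qps = requests_per_day / SECONDS_PER_DAY
    # Traffic is never flat, so assume the peak is some multiple of the average.
    peak_qps = avg_qps * peak_factor
    # Storage if every request writes one payload (decimal gigabytes).
    storage_gb_per_day = requests_per_day * avg_payload_bytes / 1e9
    return {
        "avg_qps": avg_qps,
        "peak_qps": peak_qps,
        "storage_gb_per_day": storage_gb_per_day,
    }


# Hypothetical product: 1 million daily users, 10 requests each, 1 KB per request.
estimate = estimate_capacity(1_000_000, 10, 1_000)
print(estimate)
```

With these made-up inputs, the estimate works out to roughly 116 requests per second on average and 10 GB of new data per day. That is already enough to tell you whether a single machine is a realistic starting point.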
Chapter 2 - Knowledge of Databases
Like programming languages, different types of databases specialize in a particular operation or workload. Relational databases ruled the database world for a long time. But, with increasing user traffic and changing data access, write, and analysis patterns, simple relational databases became one of the biggest bottlenecks in making a backend system scalable. That's what led to the birth of special-purpose databases. Although most of them want to become the de facto database choice for all types of applications (database-as-a-service, or DBaaS, is a vast business opportunity), many of them are helpful only in specific use cases in real life. The following are the most prominent types that you will find in the market today -
- Relational databases - E.g. MySQL, PostgreSQL, MariaDB, Oracle, Microsoft SQL Server, etc.
- Document-based databases - E.g. MongoDB, Couchbase, etc.
- In-memory databases - E.g. Redis, etc.
- Graph databases - E.g. Neo4j, Amazon Neptune, etc.
- Wide-column databases - E.g. Apache Cassandra, HBase, etc.
- Time-series databases - E.g. InfluxDB, TimescaleDB, etc.
- Vector databases - E.g. Pinecone, Chroma, etc.
You should also read the basics of each database type and learn the architecture they were built on. Research the use cases where they fit and why they serve that use case well. Understanding the problem they solve and the solution will accelerate your knowledge by 5-10 years.
Chapter 3 - Basics of computer networking and operating systems
This is from my own experience. I learned the "operating systems" and "computer networking" subjects in my bachelor's degree in computer science, just like many people did. And I never took those subjects seriously (just like most people don't). The reason was simple. I just wanted to pass my exams. I could never understand the impact of these concepts in real life while building backend applications and designing software architecture. Unfortunately, I had to learn their importance the hard way. Many systems I designed at the beginning of my career didn't scale after a certain point because of my lack of knowledge about networking and operating systems. I recommend learning both these topics at a fundamental level and then learning how they have evolved over the last two decades.
Chapter 4 - Knowledge of deployment and cloud
If you know how to build web and mobile applications, you should also know how to run them and make them accessible to real users. Cloud and deployments are topics that are big enough to create multiple courses on them. But you can start learning by focusing on things that cater to 90% of use cases. Following is the list to begin with. Learn them in the context of the three biggest cloud providers in the world: Amazon Web Services (AWS), Google Cloud Platform (GCP), and Microsoft Azure -
- Cloud networking - An example is Amazon VPC.
- Virtual machines - Examples are Amazon EC2, Google Compute Engine virtual machine instances and Azure Virtual Machines.
- File storage (media storage) - Examples are Amazon S3, Google Cloud Storage, Azure Blob Storage, etc.
- Managed databases - Examples are Amazon RDS, Amazon DynamoDB, Google Cloud Spanner, Azure Cosmos DB, etc.
- Container solutions - Examples are Amazon ECS, Google Cloud Run, etc.
- Serverless - I recommend learning about serverless container solutions (for example, AWS Fargate) instead of serverless functions. I believe serverless functions will soon be a thing of the past (another blog-worthy discussion).
Chapter 5 - Advanced system design concepts and case studies
After you have learned the above chapters, it will be an excellent time to jump to some advanced topics such as -
- ACID and BASE databases
- Indexing in Databases - In-depth
- Redundancy in databases
- Replication in databases
- Partitioning (and sharding) in databases
- Distributed locking techniques
- Hashing in databases
- Distributed cache system design
- How do cache databases work?
- Gossip protocol
- Operational transformation
- Differential synchronization
- CRDT (Conflict-free replicated data type)
- MapReduce Technique
- Stream processing at scale - Micro-batches, continuous operator model - Apache Storm
- Extract, Transform, Load (ETL)
- The technique of sending redundant data over UDP - multiplayer game design.
- PACELC theorem
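To make one of the topics above concrete, here is a minimal sketch of a grow-only counter (G-Counter), one of the simplest CRDTs. The class name, method names and node IDs are my own illustrative choices and do not come from any particular library. Each replica only increments its own slot, and merging takes the per-node maximum, so replicas converge no matter in which order (or how many times) they exchange state.

```python
class GCounter:
    """Grow-only counter CRDT: one non-decreasing count per node."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.counts = {}  # node_id -> count contributed by that node

    def increment(self, amount=1):
        if amount < 0:
            raise ValueError("a G-Counter can only grow")
        # A replica only ever writes to its own slot.
        self.counts[self.node_id] = self.counts.get(self.node_id, 0) + amount

    def value(self):
        # The counter's value is the sum of all per-node contributions.
        return sum(self.counts.values())

    def merge(self, other):
        # Element-wise maximum is commutative, associative and idempotent,
        # which is what makes the replicas converge.
        for node, count in other.counts.items():
            self.counts[node] = max(self.counts.get(node, 0), count)


# Two hypothetical replicas that update independently, then sync.
replica_a = GCounter("a")
replica_b = GCounter("b")
replica_a.increment(2)
replica_b.increment(3)
replica_a.merge(replica_b)
replica_b.merge(replica_a)
print(replica_a.value(), replica_b.value())
```

After the two merges both replicas report the same total, and merging again changes nothing. Counters that also need to decrease, and richer types such as sets, build on the same idea.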
Chapter 6 - Knowledge of testing
Why are we talking about testing? Isn't that supposed to be a different career? The answer is no. In many companies, developers (both frontend and backend) are also expected to write a type of test called unit tests. Most modern languages and frameworks come with support for writing unit tests. A related approach to software development, in which you write the tests before the code they exercise, is called Test-Driven Development (TDD). If you want to excel as a developer, testing applications and end-to-end systems from the point of view of both a developer and an end user is essential. Learn TDD in the context of your programming language and learn how to figure out edge and corner cases.
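As a small illustration of unit tests and edge cases, here is a sketch using Python's built-in unittest module. The function is_leap_year and the test names are hypothetical examples I picked because the century rule is a classic corner case. In TDD you would write the tests first, watch them fail, and then write the function.

```python
import unittest


def is_leap_year(year):
    """Gregorian rule: divisible by 4, except centuries not divisible by 400."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


class TestLeapYear(unittest.TestCase):
    def test_typical_leap_year(self):
        self.assertTrue(is_leap_year(2024))

    def test_typical_common_year(self):
        self.assertFalse(is_leap_year(2023))

    def test_century_is_not_leap(self):
        # Corner case: divisible by 100 but not by 400.
        self.assertFalse(is_leap_year(1900))

    def test_fourth_century_is_leap(self):
        # Corner case: divisible by 400.
        self.assertTrue(is_leap_year(2000))
```

If you save this in a file, you can run the tests with `python -m unittest` followed by the file name. Notice that half of the tests exist only for the edge cases, which is usually where the bugs hide.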
Chapter 7 - Data Structures and Algorithms
You will find a lot of people arguing over this on the Internet. Many people have different takes, opinions, and perspectives on how much knowledge of data structures and algorithms is essential for a day-to-day developer job. And I have an opinion too. But my job is to give you pragmatic and actionable advice, not opinions. I will probably write another detailed blog about it, but to begin with, you should cover the following data structures and algorithms to become good at programming and problem-solving.
List of data structure concepts to cover -
- Asymptotic analysis and Time complexity
- Basic data structures - Arrays, stacks, queues, heaps, linked lists, hash tables, dictionaries, etc.
- Advanced data structures - Trees, graphs, B+ trees, AVL trees, etc.
- Data structures for hashing - Chained bucket hashing, extendable hashing, Linear hashing, modified linear hashing, Bitmap indexing, B tree
- Bloom filter
- Merkle tree
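To show how small some of these structures are once you understand the idea, here is a minimal Bloom filter sketch. The class name, the default sizes and the way I derive several hash positions from SHA-256 are illustrative assumptions, not a production design. A Bloom filter can tell you that an item is definitely not in a set or that it might be, and it never stores the items themselves.

```python
import hashlib


class BloomFilter:
    def __init__(self, size_bits=1024, num_hashes=3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        # One byte per bit keeps the sketch simple; a real one would pack bits.
        self.bits = bytearray(size_bits)

    def _positions(self, item):
        # Derive num_hashes positions by salting the item with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos] for pos in self._positions(item))


seen_users = BloomFilter()
seen_users.add("alice")
print(seen_users.might_contain("alice"))
```

An item that was added always comes back as possibly present. An item that was never added usually comes back as absent, but it can occasionally be a false positive, and the chance of that grows as the filter fills up.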
List of Algorithm concepts to cover -
- Sorting - Bubble sort, insertion sort, quick sort, merge sort, selection sort, and the concept of stable sorting
- Tree traversal and search (Breadth-first search and depth-first search)
- Binary search tree
- Greedy algorithms
- Dynamic programming
- Graph algorithms
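For the traversal item in the list above, here is a short sketch of breadth-first and depth-first search over an adjacency-list graph. The function names and the sample tree are hypothetical. Both functions visit every reachable node exactly once and differ only in the order: BFS goes level by level using a queue, while DFS goes as deep as it can before backtracking.

```python
from collections import deque


def bfs(graph, start):
    """Return nodes in breadth-first order from start."""
    order = [start]
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                order.append(neighbor)
                queue.append(neighbor)
    return order


def dfs(graph, start):
    """Return nodes in depth-first (pre-order) order from start."""
    order = []
    seen = set()

    def visit(node):
        seen.add(node)
        order.append(node)
        for neighbor in graph.get(node, []):
            if neighbor not in seen:
                visit(neighbor)

    visit(start)
    return order


# A small hypothetical tree: A has children B and C, and so on.
tree = {"A": ["B", "C"], "B": ["D", "E"], "C": ["F"]}
print(bfs(tree, "A"))  # ['A', 'B', 'C', 'D', 'E', 'F']
print(dfs(tree, "A"))  # ['A', 'B', 'D', 'E', 'C', 'F']
```

The recursive DFS is the easiest version to read, but on very deep graphs you would switch to an explicit stack to avoid hitting Python's recursion limit.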
This is just the beginning
If you think completing the above list will immediately transform you into a 10X developer (or insert your favourite developer adjective here), think again. In the world of computer science and software engineering, the above list is your starting point. The eventual goal on your end should be understanding all the above topics to develop a problem-solving mindset. Mugging them up may help you clear some tech interviews, but it will never make you a good developer.
I will cover the remaining problems in parts two and three of this series. Till then, share this article with your friends and colleagues. If you have any questions, feel free to mention them in the comments. You can also ping me directly on Twitter, LinkedIn or Instagram.