Wednesday, August 7, 2024

CST 334 Week 8 Report

This week, we delved more deeply into the essentials of persistence in operating systems. We began by examining the fundamental interactions between the OS and hardware devices. Efficient communication with a device relies on two key components: the hardware interface and its internal organization. To minimize CPU load, three main techniques are used: interrupt-driven I/O, programmed I/O, and direct memory access (DMA). DMA is particularly advantageous for systems handling large volumes of data, especially with frequent transactions, as it reduces the need for constant CPU involvement during data transfers. We also studied the ways the OS interacts with devices, focusing on explicit I/O instructions and memory-mapped I/O. Additionally, we learned that device drivers play a crucial role by encapsulating the details of how a device works, abstracting its operation away from the rest of the OS.

Our exploration continued with the basics of hard disk drives. Modern hard disks feature platters on a spindle, with data stored on each surface in concentric circles called tracks. A disk head, attached to a disk arm, reads and writes this data. Various disk I/O scheduling algorithms were discussed, ranging from simple methods like first-come, first-served to more advanced ones such as budget fair queueing.

In the realm of file systems, we covered persistent storage devices like HDDs and SSDs. We focused on the core abstractions of storage: files and directories, which are fundamental to data persistence in the OS. We explored the file system interface, including file creation, access, and deletion. We concluded the week by implementing a basic file system using vsfs (Very Simple File System), a simplified model of a typical UNIX file system. Key takeaways included understanding the structure and access methods of file systems, learning about the inode (index node) for file metadata, and exploring multi-level indexing, directory organization, and free space management. Overall, it was a productive week of learning, and I look forward to building on these foundational concepts.
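To make the file-interface piece concrete, here is a minimal C sketch of creating, reading, and deleting a file through the standard UNIX calls; the filename and buffer size are just placeholders:

```c
/* Minimal sketch of the basic file API: create, write, read back,
   and delete a file. Compile on Linux with: gcc file_demo.c */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void) {
    char buf[16];

    /* Create (or truncate) a file and write to it. */
    int fd = open("demo.txt", O_CREAT | O_WRONLY | O_TRUNC, 0644);
    if (fd < 0) { perror("open"); return 1; }
    write(fd, "persist", 7);
    close(fd);

    /* Re-open and read the data back. */
    fd = open("demo.txt", O_RDONLY);
    ssize_t n = read(fd, buf, sizeof(buf) - 1);
    if (n < 0) n = 0;
    buf[n] = '\0';
    close(fd);
    printf("read back: %s\n", buf);

    /* Deleting a file is really unlink(): it removes the directory
       entry that points at the file's inode. */
    unlink("demo.txt");
    return 0;
}
```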

When it comes to persistence of personal character, I learned quite a bit over the course. I learned that even when I am confused during an assignment, I can simply sit with the challenge and keep examining it until it truly sinks into my understanding. I also learned that I can sometimes rely on others to clarify things for me instead of trying to brute-force a solution on my own. Thankfully, Dr. Ogden was helpful on Slack and cleared up any confusion. I learned that by developing more resilience and discipline, I can accomplish any task as long as I stay focused. Thanks.

Thursday, August 1, 2024

CST 334 Week 7 Report

This week, we learned a ton about the fundamentals of persistence in operating systems. We started by looking at basic device interactions with the operating system. The device itself requires two parts to make interaction efficient: the hardware interface and its internal structure. There are three main mechanisms employed to reduce CPU overhead: interrupt-driven I/O, programmed I/O, and direct memory access (DMA). For systems that move large volumes of data, DMA is superior, especially if transfers are frequent, because the CPU does not have to be involved throughout each transfer. When it comes to how the OS interacts with the device, there are two primary methods: one is to have explicit I/O instructions, and the second is known as memory-mapped I/O. Finally, the device driver is the piece of software that abstracts the device's function away from the OS by encapsulating the details of how the device works.

Afterwards, we learned about the basics of hard disk drives. Modern disks have platters on a spindle. Data is encoded on each surface in concentric circles of sectors called tracks. We read data from the surface with a disk head attached to a disk arm. There are numerous disk I/O scheduling algorithms that can be employed, from basic ones like first-come, first-served to modern algorithms like budget fair queueing.

When it comes to file systems, we learned about persistent storage and devices like HDDs and SSDs. The two basic abstractions developed for storage are files and directories, which comprise the bread and butter of persistence in the OS. We explored the file system interface, including creating, accessing, and deleting files. We wrapped up our week by studying a simple file system implementation, the vsfs (very simple file system), which is a simplified version of a typical UNIX file system. When thinking of file systems, we should think about two primary aspects: the data structures of the file system and the access methods required to actually do things with the data. We learned about the inode, or index node, which is the structure that holds metadata for a given file. We were also able to learn more about multi-level indexing, directory organization, and free space management within a file system. All in all, it was a good week of learning and I hope we can continue to build on these basic concepts.
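As a rough illustration of the inode and multi-level indexing ideas, here is a sketch of what a vsfs-style inode might look like in C; the field names and pointer counts are illustrative, not the exact vsfs layout:

```c
/* Sketch of a vsfs-style inode with multi-level indexing.
   Field names and counts are illustrative. */
#include <stdint.h>

#define NUM_DIRECT 12   /* small files: data blocks addressed directly */

struct inode {
    uint32_t size;               /* file size in bytes                  */
    uint16_t type;               /* regular file, directory, ...        */
    uint16_t links;              /* number of directory entries (hard links) */
    uint32_t direct[NUM_DIRECT]; /* direct pointers to data blocks      */
    uint32_t indirect;           /* block filled with further pointers  */
    uint32_t double_indirect;    /* block of pointers to pointer blocks */
};
```

A small file is reached entirely through the direct pointers; once it outgrows them, the indirect block (and then the double-indirect block) extends the reach by another level of pointers.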

Saturday, July 27, 2024

CST334 Week 6 Report

 Concurrency: Part II

This week, we learned a ton more about concurrency in operating systems. Notably, the main topic was semaphores, which are essentially an upgraded version of our previous basic locks and condition variables, and which we can use to improve system performance, especially in multi-threaded applications. Our book specifically defines a semaphore as an object with an integer value that we can manipulate with two routines, which in the POSIX standard are sem_wait() and sem_post(). It is important to remember that the initial value of a semaphore defines its behavior, so it must first be initialized to some value. The first type of semaphore we studied is the binary semaphore, used as a lock. Next, we learned how to use semaphores for ordering events in a concurrent program. These semaphores can be very useful when a thread is waiting for a list to become non-empty so that it can delete an element from it. Specifically, the semaphore is signaled when another thread has completed its work, so that the waiting thread awakens into action, much like how condition variables work. We also studied the producer/consumer problem and the dining philosophers problem as a means of understanding semaphores on a deeper level. Learning to avoid concurrency bugs, including deadlock, was very helpful, especially since we may implement semaphores ourselves in the future.
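Here is a minimal sketch, in the style of the book's examples, of a semaphore used to order two threads; because the semaphore is initialized to 0, the parent blocks in sem_wait() until the child calls sem_post():

```c
/* Ordering two threads with a POSIX semaphore.
   Compile with: gcc sem_demo.c -pthread */
#include <pthread.h>
#include <semaphore.h>
#include <stdio.h>

sem_t s;

void *child(void *arg) {
    printf("child\n");
    sem_post(&s);                /* signal: child is done */
    return NULL;
}

int main(void) {
    sem_init(&s, 0, 0);          /* initial value 0 defines the behavior */
    pthread_t c;
    pthread_create(&c, NULL, child, NULL);
    sem_wait(&s);                /* block here until the child posts */
    printf("parent: end\n");
    pthread_join(c, NULL);
    return 0;
}
```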

Sunday, July 21, 2024

CST334 Week 5 Report

 CST334 Week 5 Journal

This week, we learned a ton about the basics of operating system concurrency. One of the most fundamental things we learned about is the thread. A thread is basically a computational unit within a process. Although a single thread is semi-independent, it shares a single logical address space with the other threads in its process, allowing them to access the same data; at the same time, one process can have many threads. We use threads to support parallelism and also to avoid blocking program progress on slow I/O. Another very important concept we learned is the lock, which is designed to help us execute a series of instructions atomically. By placing locks around critical sections in code, programmers can ensure that the section is executed as if it were a single atomic instruction. We implement locks by declaring lock variables, which hold the state of the lock (available or acquired). We evaluate the efficacy of a particular lock type by looking at several goals: mutual exclusion, fairness, and performance are the main objectives. The ticket lock has a key advantage over the basic spin lock in that it prevents a thread from starving: it ensures that each waiting thread will acquire the lock at some point in the future.

Several data structures can be made concurrent by adding locks with performance in mind. There are concurrent counters, linked lists, queues, and hash tables, all with pros and cons that become particularly apparent when factoring in scalability. It is important to keep several things in mind, though: more concurrency does not necessarily increase performance, and performance problems should only be remedied once they actually exist. Finally, threads can use a condition variable to avoid the traditional problem of a thread inefficiently spinning until some condition is true. In a nutshell, a thread waiting on the condition variable's queue is signaled by another thread to awaken and continue.
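As a small illustration of a lock guarding a critical section, here is a sketch of the classic shared-counter example using a pthread mutex; the iteration count is arbitrary:

```c
/* Two threads increment a shared counter; the mutex makes each
   increment atomic. Compile with: gcc counter.c -pthread */
#include <pthread.h>
#include <stdio.h>

static int counter = 0;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

void *worker(void *arg) {
    for (int i = 0; i < 1000000; i++) {
        pthread_mutex_lock(&lock);    /* enter critical section */
        counter++;                    /* runs as if atomic */
        pthread_mutex_unlock(&lock);  /* leave critical section */
    }
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, worker, NULL);
    pthread_create(&t2, NULL, worker, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("counter = %d\n", counter);  /* always 2000000 with the lock */
    return 0;
}
```

Without the lock, the two threads' increments can interleave and the final count comes out short; with it, each increment executes as a unit.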

Saturday, July 13, 2024

CST Week 4 Post

 CST Week 4 Summary

This week we learned a ton about how the operating system manages virtual memory, especially paging. Paging is basically an alternative to the segmentation approach when it comes to managing memory. Instead of splitting up a process's address space into some number of variable-sized segments, we divide it into fixed-size units, each of which is called a page. There are numerous advantages to paging, from avoiding external fragmentation to being very flexible and enabling sparse use of virtual address spaces.
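The core arithmetic of paging is easy to show in a few lines of C. This sketch assumes 4 KB pages (a common but not universal choice) and splits a virtual address into a virtual page number (VPN) and an offset:

```c
/* Splitting a virtual address into VPN and offset, assuming 4 KB pages. */
#include <stdio.h>

#define PAGE_SIZE   4096u
#define OFFSET_BITS 12        /* log2(4096) */

int main(void) {
    unsigned vaddr  = 0x3412;                  /* example virtual address */
    unsigned vpn    = vaddr >> OFFSET_BITS;    /* which page              */
    unsigned offset = vaddr & (PAGE_SIZE - 1); /* where inside the page   */
    printf("vaddr 0x%x -> VPN %u, offset 0x%x\n", vaddr, vpn, offset);
    return 0;
}
```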

To be blunt, we cannot simply implement paging into our system willy-nilly - we have to take numerous factors into account in order to ensure good memory management. One is ensuring that we have a translation-lookaside buffer, or TLB, to cache frequently used virtual-to-physical address mappings. The main purpose of the TLB is to keep our system running quickly - we won't have to perform a full page table lookup for an address mapping if it is in the TLB. Another important technique for keeping page tables manageable is a hybrid approach: instead of having a single page table for the entire address space of the process, we can have one per logical segment. We may thus have three page tables, one each for the code, heap, and stack parts of the address space. In addition, the OS can swap infrequently used portions of address spaces out to the hard disk drive, in order to support the illusion of a single large address space. When memory is nearly full, the OS will also page out one or more pages to make room for the new page about to be used. The process of picking a page to kick out, or replace, is known as the page replacement policy, and there are several. Notably, the optimal policy, developed by Belady, says to replace the page that will be accessed furthest in the future. The optimal policy is an ideal benchmark that real policies - including FIFO, random, and LRU, among others - can only approximate.
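As a toy illustration of the LRU policy mentioned above, this sketch picks a victim by scanning for the resident page with the oldest last-use time; the timestamps are invented, and real systems approximate LRU (e.g., with a clock algorithm) rather than scanning like this:

```c
/* Toy LRU replacement decision: evict the page whose last use is oldest. */
#include <stdio.h>

#define NUM_FRAMES 4

int main(void) {
    /* last_used[i] = logical time the page in frame i was last accessed */
    unsigned last_used[NUM_FRAMES] = {7, 2, 9, 5};

    int victim = 0;
    for (int i = 1; i < NUM_FRAMES; i++)
        if (last_used[i] < last_used[victim])
            victim = i;   /* least recently used so far */

    printf("evict page in frame %d\n", victim);  /* frame 1 here */
    return 0;
}
```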

Wednesday, July 3, 2024

CST334 Week 3 Report

 CST334: Weekly Learning Summary Pt. III

This week, we learned a ton in Operating Systems. We learned mostly about how the OS virtualizes memory. In essence, we looked at how the OS utilizes hardware-based address translation. The OS creates an easy-to-use abstraction of memory by way of the address space: this space contains all the memory state of the running program: the code, the stack, the heap, etc. Every address a process generates is not a real memory address, but rather a virtual address that must be translated into a real, physical memory address on each memory access. With dynamic relocation, we use a base register to transform virtual addresses into physical addresses; furthermore, a bounds register ensures that addresses stay within the confines of the address space. These base and bounds registers are managed by a part of the CPU known as the memory management unit, or MMU. On an important note, both internal and external fragmentation are necessary evils in this address translation model, and our job as computer scientists is to try to minimize them while managing memory.
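Here is a small sketch of what dynamic relocation looks like in code; the base and bounds values are made up, and in reality the MMU hardware performs this check-and-add on every memory access:

```c
/* Base-and-bounds translation: check the bounds register, then add
   the base register. Register values here are invented. */
#include <stdio.h>

static const unsigned BASE   = 0x8000;  /* where the process sits in physical memory */
static const unsigned BOUNDS = 0x4000;  /* size of its address space */

int translate(unsigned vaddr, unsigned *paddr) {
    if (vaddr >= BOUNDS)
        return -1;             /* out of bounds: hardware raises an exception */
    *paddr = BASE + vaddr;     /* relocation: physical = base + virtual */
    return 0;
}

int main(void) {
    unsigned p;
    if (translate(0x1234, &p) == 0)
        printf("virtual 0x1234 -> physical 0x%x\n", p);  /* 0x9234 */
    return 0;
}
```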


Saturday, June 29, 2024

CST334 Week 2 Report

Operating Systems Week 2 Summary

This week we learned a ton about processes and how the operating system manages them. The operating system virtualizes one or a few CPUs in order to run many processes concurrently. The operating system employs a combination of low-level machinery and higher-level scheduling policies to accomplish this. On the low-level side, we have techniques such as the context switch, which is used when changing the currently running process to a new one. We also saw various system calls such as fork(), exec(), and wait(). In order to protect the system from any harm a process might cause, we use limited direct execution - the process must run under limitations imposed by the operating system. While the process typically runs in user mode, the operating system runs in kernel mode, which means it has unrestricted access to machine resources.
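Here is a minimal sketch of those three system calls working together; it assumes a Linux system where ls is on the PATH:

```c
/* fork() a child, have it exec() a new program, wait() in the parent.
   Compile with: gcc procs.c */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    pid_t pid = fork();            /* create a near-copy of this process */
    if (pid < 0) {
        perror("fork");
        exit(1);
    } else if (pid == 0) {
        /* child: replace this process image with ls */
        execlp("ls", "ls", "-l", (char *)NULL);
        perror("execlp");          /* only reached if exec fails */
        exit(1);
    } else {
        wait(NULL);                /* parent blocks until the child exits */
        printf("child %d finished\n", (int)pid);
    }
    return 0;
}
```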

We must combine low-level machinery with higher-level policies, or disciplines - all while trying to simultaneously optimize for performance metrics such as turnaround time and response time. There are several approaches to scheduling processes, from shortest job first to round robin. Ultimately, a more practical approach is the multi-level feedback queue, or MLFQ. In this treatment, we have a number of distinct queues, each with a different priority level. Jobs at the highest priority level run first, in round-robin fashion with the other jobs sharing that priority level, which means each job runs for a predetermined time slice, or scheduling quantum, alternating with the other jobs. When a job's allotment is spent, its priority level is automatically lowered - this is so that more interactive processes stay at higher priority, while longer, more CPU-intensive processes sink to the lower priority levels.
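As a toy sketch of the MLFQ demotion rule, the snippet below tracks a single job's CPU usage at its current level and drops it a level when its allotment is spent; the level count and allotment numbers are invented:

```c
/* Toy MLFQ demotion rule: a job that spends its allotment at one
   level drops to the next lower-priority queue. */
#include <stdio.h>

#define LEVELS 3

struct job {
    int priority;     /* LEVELS-1 is highest, 0 is lowest       */
    int used_ms;      /* CPU time used at the current level     */
    int allotment_ms; /* allotment granted per level            */
};

void charge(struct job *j, int ran_ms) {
    j->used_ms += ran_ms;
    if (j->used_ms >= j->allotment_ms && j->priority > 0) {
        j->priority--;    /* demote: looks CPU-bound, not interactive */
        j->used_ms = 0;   /* fresh allotment at the new level */
    }
}

int main(void) {
    struct job j = { LEVELS - 1, 0, 20 };
    charge(&j, 10);   /* short burst: stays at the top level */
    charge(&j, 10);   /* allotment spent: demoted one level  */
    printf("priority now %d\n", j.priority);  /* prints 1 */
    return 0;
}
```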

Wednesday, June 26, 2024

CTI: Personal Value Proposition (Beta Version)

 Dear Amazon and Google,

I heard that you are in need of a network engineer. I am here because I believe we can work together to produce a better level of stability and function in all areas of your workflow, as well as in those your clients navigate.

I am a network engineer with five years of experience and a proven reputation for improving company network reliability, and you can count on me to accomplish the following:

  • reduce network downtime by at least 50%
  • continually design new and improved implementations for the local network
  • fully integrate within the current team and enhance group productivity

As leading companies within the software industry, you stand to benefit from what I can bring to the table. I would like to speak with you about how we can begin improving your network strategies.

Sincerely,

Luis

Monday, June 24, 2024

CST334 Week 1 Report

 Summary of Week 1

This week, we learned a ton in CST334. Most notably, we learned the basics of C and Linux Bash, as well as some introductory operating system concepts. I really enjoyed getting introduced to these concepts because they will help us see the overall picture of how operating systems work under the hood. For example, operating systems actually "trick" applications into thinking they have unlimited access to the CPU(s) and memory by virtualizing the CPU and memory. To me, that is a genius invention and I am hyped to learn more about it. Going forward, we will also learn about concurrency and persistence, the other main branches of modern operating systems.

Regarding C, it is a language very similar to C++, although it is slightly lower level, so we tend to work more closely with memory addresses - on top of C being considered a 'procedural' language rather than an object-oriented one. It makes sense to pair C with Linux in this regard. Bash is basically the Linux terminal shell, where we are able to manage files and run programs. We learned how to work with Docker to "contain" Linux and run Linux programs by way of Windows PowerShell, which was a blast. I really like how we can even debug programs with GDB.
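As a tiny example of what working closely with memory addresses looks like in C, here is a sketch using the address-of and dereference operators; it can be built with gcc and stepped through in GDB:

```c
/* Taking an address with & and dereferencing with *.
   Build and inspect with: gcc -g demo.c && gdb ./a.out */
#include <stdio.h>

int main(void) {
    int x = 42;
    int *p = &x;      /* p holds the address of x */

    printf("x lives at %p and holds %d\n", (void *)p, *p);
    *p = 7;           /* write through the pointer */
    printf("x is now %d\n", x);
    return 0;
}
```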

Wednesday, June 5, 2024

CST 363 Week 8 Report

 Most Important Things Learned During CST 363: Introduction to Databases

  1. The most important thing I learned, if I had to choose, is how to work with SQL and MongoDB in Java. I really liked labs 19 and 21 for this, because we learned quite a bit about how we can manage databases using Java. We were able to put everything together during these labs and really test our skills.
  2. The second most important thing I learned is how to write SQL queries, because this was the gateway into learning about databases. I think starting with SQL queries was a perfect way to get us to fundamentally understand how databases work. 
  3. The third most important thing I learned is how databases work beneath the surface. While theoretical knowledge like this may not be immediately practical, I think in the long term it will serve us well to at least be aware of how databases really work. I enjoyed learning the differences between relational and document databases, in particular.

CST 363 Week 7 Report

MySQL and MongoDB are both great programs one can use to manage databases.

Similarities: Both programs are open source and can be installed on many different operating systems, such as Windows, Mac, and Linux. Both support indexing and large amounts of data, as well as sharding, which can help distribute data across many machines.

Differences: MySQL is a relational database, which means it is centered on tables, whereas MongoDB is a NoSQL, document-based database. In MongoDB, data is stored in documents, specifically in BSON, a binary format closely related to JSON. MongoDB has its own query language, whereas MySQL uses SQL, the Structured Query Language. Furthermore, MongoDB does not require schemas, and each document can have a different structure, whereas MySQL relies heavily on schemas and every table must have a predefined schema structure.

When to choose MySQL: MySQL should be chosen whenever we require structured data and very reliable, consistent transactional integrity. Another indicator for MySQL is when we require complicated queries and joins. Typical applications include e-commerce and web apps.

When to choose MongoDB: MongoDB should be selected when there is a need to store large volumes of unstructured data. Another good indicator is a need for flexible schemas and fast, high-volume writes. Typical applications include real-time analytics.

Friday, May 31, 2024

CST 363 Week 6 Report

Summary of Week's Learning

This week, we learned a ton about the Spring web server and also the JDBC API. JDBC is a Java-based API that enables us to interact with databases. The Spring web server is a programming component that acts as an environment in which we can create web applications. We learned how to connect the two with a custom-made SQL schema, which was actually pretty fun. I really liked how we were able to see real-time updates from our Spring web server reflected in our MySQL Workbench queries. This makes sense, because both were using the exact same database through JDBC access. Overall, it was a great week full of learning and I look forward to learning about MongoDB next week.

Sunday, May 19, 2024

CST 363 Week 5 Report

Generally speaking, an index provides faster read times for queries in SQL. There are three elements, however, that, when combined, can make an index lookup slower than expected. The first involves traversing the leaf node chain. The second is accessing the table: if the matching entries point to many different table blocks, that is a bad sign for performance. The third is fetching the relevant data from the table(s). Databases can actually be asked how they use an index. The Oracle database, for example, has three operations for describing an index lookup; the index range scan, which performs the tree traversal and then follows the leaf node chain to find all matching entries, is the default fallback operation when multiple entries could match. It is precisely in this scenario that a "slow index" can manifest.