Note: BU master's; math major and CS minor, with other coursework to make me a well-rounded human.
Utilized Python tools and techniques for statistics, visualization, and other data science tasks across a variety of applications, including finance, text processing, time series analysis, and recommendation systems.
Utilized various data structures to solve computational problems and implemented them in a high-level programming language. Created, decomposed, and expressed algorithms as pseudocode, and analyzed their running time and computational complexity.
Learned the latest relational and object-relational tools and techniques for persistent data and object modeling and management. Gained extensive hands-on experience using Oracle or Microsoft SQL Server while learning Structured Query Language (SQL) and how to design and implement databases.
Utilized R for statistical computing and graphics while learning the mathematical and practical background required in the field of data analytics. Paired probability and statistics concepts with data summarization techniques and hypothesis testing to build a solid foundation in data analytics.
Topics included simple linear regression, multiple regression, logistic regression, analysis of variance, and survival analysis, providing an overview of the statistical tools most commonly used to process, analyze, and visualize data. These topics were explored using R, with a focus on understanding how to use and interpret the software's output and how to visualize results.
Covered areas of web mining, machine learning fundamentals, text mining, clustering, and graph analytics. This included learning fundamentals of machine learning algorithms, how to evaluate algorithm performance, feature engineering, content extraction, sentiment analysis, distance metrics, fundamentals of clustering algorithms, how to evaluate clustering performance, and fundamentals of graph analysis algorithms.
Discusses basic methods for designing and analyzing efficient algorithms, emphasizing methods used in practice. Topics include sorting, searching, dynamic programming, greedy algorithms, advanced data structures, graph algorithms (shortest path, spanning trees, tree traversals), matrix operations, string matching,
This course is an introduction to large-scale data analytics. Big Data analytics is the study of how to extract actionable, non-trivial knowledge from massive data sets. This class will focus both on the cluster computing software tools and programming techniques used by data scientists and on the important mathematical and statistical models used in learning from large-scale data processing. On the tools side, we will cover the basic systems and techniques for storing large volumes of data, as well as modern systems for cluster computing based on the MapReduce pattern, such as Hadoop MapReduce, Apache Spark, and Flink. Students will implement data mining algorithms and execute them on real cloud systems like Amazon AWS, Google Cloud, or Microsoft Azure by using educational accounts. On the data mining models side, this course will cover the main standard supervised and unsupervised models and will introduce techniques for improving them.
This course aims to study basic concepts and techniques of data mining. The topics include data preparation, classification, performance evaluation, association rule mining, regression, and clustering. We will discuss basic data mining algorithms in class, and students will practice data mining techniques using Python or R.
This course covers advanced aspects of database management, including normalization and denormalization, query optimization, distributed databases, data warehousing, and big data. There is extensive coverage of, and hands-on work with, SQL and database instance tuning. The course covers various modern database architectures, including relational, key-value, object-relational, and document store models, as well as various approaches to scaling out, integrating, and implementing database systems through replication and cloud-based instances. Students learn about unstructured "big data" architectures and databases, and gain hands-on experience with Spark and MongoDB. Students complete a term project exploring an advanced database technology of their choice.