Content: Developed at LinkedIn, Apache Kafka is a distributed streaming platform that provides scalable, high-throughput messaging in place of traditional messaging systems like JMS. In this course, examine all the core concepts of Kafka. Ben Sullins kicks off the course by making the case for Kafka, and explaining who's using this efficient platform and why. He then shares Kafka workflows to provide context for core concepts, explains how to install and test Kafka locally, and dives into real-world examples. By the end of this course, you'll be prepared to achieve scalability, fault tolerance, and durability with Apache Kafka. Duration: 01:20:48.00
Content: Approach big data with confidence by mastering the core skills needed to put data to work for your business. This course covers the basics of data engineering, system design, analytics, and business intelligence. Data science expert Ben Sullins explains how to collect and organize your data so you can deliver results that your organization can leverage. Ben starts by examining the modern data ecosystem and how it relates to running a smart and efficient data hub. Then, he shows you how to perform the principal tasks involved in managing, loading, extracting, and transforming data. He also takes you through staging, profiling, cleansing, and migrating data. Along the way, he provides actionable recommendations that are applicable to data experts throughout an organization: analysts, engineers, scientists, modelers, and more. Duration: 00:53:24.00
Content: Python is a popular programming language in the field of data science, used for engineering and analytics, as well as data science itself. In this course, instructor Ben Sullins focuses on teaching you how to use Python in notebooks set up on the Jupyter platform. Ben walks through installing Jupyter using Anaconda, then shows you how to navigate the user interface and get Python running on a notebook in Jupyter. He covers how to import pandas, use pandas to explore sample data, and use data frames. Ben steps you through a number of functions, then concludes with some of the unique ways to present your data using Jupyter notebooks and Python. Note: This course was created by Ben Sullins and Free the Data Academy. We are pleased to host this training in our library. Duration: 00:27:57
Content: Businesses thrive by making informed decisions that target the needs of their customers and users. To make such strategic decisions, they rely on data. Hive is a tool of choice for many data scientists because it allows them to work with SQL, a familiar syntax, to derive insights from Hadoop that give businesses the information they need to plan effectively. This course shows how to use Hive to process data. Instructor Ben Sullins starts by showing you how to structure and optimize your data. Next, he explains how to get Hue, the Hadoop user interface, to leverage HiveQL when analyzing data. Using the newly configured option, he then demonstrates how to load data, create aggregate tables for fast query access, and run advanced analytics. He also takes you through managing tables and putting functions to use. This course is designed to help you find new ways to work with datasets so you can answer the tough data science questions that come your way. Duration: 01:53:06.00
Content: Modern work in data science requires skilled professionals versed in analysis workflows and powerful tools. Python can play an integral role in nearly every aspect of working with data, from ingesting to querying to extracting and visualizing. This course highlights twelve tips and tricks you can put into practice to improve your skills in Python. These techniques are readily applied in common data management tasks and include the following: how to ingest data using CSV, JSON, and TXT files; how to explore data using libraries like pandas; how to organize and join data using DataFrames; how to create charts and graphic representations of data using ggplot in Python; and more. Duration: 00:47:46.00
Content: In this course, Ben Sullins debunks 12 common misconceptions within the field of data science. Busy engineers, data miners, programmers, and other systems specialists who want to bolster their skills can benefit from Ben's succinct, practical insights. Separate data science fact from fiction, and learn what big data actually is, and why, contrary to what media coverage often suggests, it's not a singular thing. Ben also explains why big data can't instantly yield great insights, how to make analytics clearer, when to replace your relational databases, and more. Duration: 00:36:05.00
Content: Netflix and Airbnb both use Presto, an open-source SQL query engine developed by Facebook, for their ever-expanding big data querying needs. In this course, learn how to harness the power of your big data system using the Presto platform, which breaks the false dilemma of having to choose between an expensive commercial solution that offers fast analytics, and a slow, ostensibly free solution that requires excessive hardware. Data science expert Ben Sullins helps you get up to speed with Presto, and leverage it to accomplish a wide range of data science and analytics tasks. He uses different interfaces with Presto, such as R and Tableau, and digs into the expressive SQL language that Presto offers for your analysis. At the end of this course, you'll know the key concepts of Presto and how to use them to take full advantage of your modern big data system. Duration: 01:48:51.00
Content: Apache HBase is the Hadoop database: a NoSQL database management system that runs on top of HDFS (Hadoop Distributed File System). Like Hadoop, HBase is an open-source, distributed, versioned, column-oriented store. Companies such as Facebook, Adobe, and Twitter are using HBase to facilitate random, real-time read/write access to big data. Any data scientist or database engineer who wants a job at these top-tier organizations needs to master HBase to make it in the door. This course can help professionals further their careers in big data analytics using HBase and the Hadoop framework. Learn to describe HBase in the context of the NoSQL landscape, build simple architecture models, and explore basic HBase commands. Instructor Ben Sullins shows how all the concepts fit together, resulting in the kind of distributed big data storage you need for scalable, enterprise-level applications. Duration: 01:20:07.00
Content: If you're interested in working in the field of data or looking to advance in the field, you need a foundational knowledge of several key areas of data science. Not only that, you need to be able to demonstrate that knowledge. In this four-part, hands-on series, Ben Sullins shows how to build four distinct data science projects using SQL, Tableau, Python, and Spark. In this first installment, Ben uses SQL to analyze employee data, which is notoriously difficult to analyze given its structure. He breaks down the specific structure of employee data, and the best way to track this kind of information, then covers how you can start answering specific questions utilizing SQL. Finally, he gives advice on how to present your data, considering both your audience and the visuals you use, in order to convey your knowledge of the subject. Note: This course was created by Free the Data Academy. We are pleased to host this training in our library. Duration: 00:44:56
Content: If you're interested in working in data or looking to advance in the field, you need a foundational knowledge of several key areas of data science. Not only that, you need to be able to demonstrate that knowledge. In this four-part series, Ben Sullins shows how to build four distinct data science projects using SQL, Tableau, Python, and Spark. In this second installment, Ben details the steps in building a sales dashboard with Tableau, the popular data visualization platform favored by organizations worldwide. Ben starts by breaking down the different aspects of Tableau, from working on a desktop, to sharing data over the web, to using the Tableau Public platform to publicly share your data visualizations. He then shows Tableau in action, looking at how it facilitates a deep dive into your data, before demonstrating how to build out an exploratory dashboard with your data. At the end of the course, you'll be able to give a live demo of your data visualizations in Tableau Public on the web. Note: This course was created by Free the Data Academy. We are pleased to host this training in our library. Duration: 01:01:15
Content: R is known as one of the most robust statistical computing solutions out there. Tableau, a leading business intelligence platform, provides excellent data visualization and exploration capabilities. When combined, Tableau and R offer one of the most powerful and complete data analytics solutions in the industry today, providing businesses with unparalleled abilities to see and understand their data. In this course, learn how to integrate these two platforms, as well as how to determine when each one is a better choice. Instructor Ben Sullins explains how to connect Tableau to R, and covers geocoding, running linear regression models, clustering, and more. Duration: 01:10:38.00
Content: Looker, a powerful data analytics platform, can help both large and small companies glean value from their data. In this short course, get up to speed with Looker, and learn how to leverage this platform to make collecting, visualizing, and analyzing data a bit easier. Ben Sullins begins by explaining how and why Looker is used, and exploring the Looker ecosystem. He also dives into how Looker organizes its data using LookML, how to visualize data in the Looker platform, and how to create a web-based dashboard. Duration: 00:47:49.00
Content: Are you considering working in data science, and would you like to try out some popular tools first? This course focuses on what you can do with Apache Spark. Instructor Ben Sullins shows you how to set up Spark on Databricks. Ben goes over how to import your data and start working with it, using both Python and SQL in Spark. He steps through taking your project to the next level with some easy data visualizations. After sharing some tips and tricks for presenting your data using Spark, Ben concludes with some additional resources that you can use to pursue your data science journey. Note: This course was created by Ben Sullins and Free the Data Academy. We are pleased to host this training in our library. Duration: 00:29:27
Content: Hadoop, the hugely popular big data platform, offers a vast array of capabilities designed to help data scientists deliver their insights. In this course, Ben Sullins helps you get up to speed with Hadoop by sharing a series of tips and tricks for doing data science work in this powerful platform. He starts by looking at how to work with Hadoop data in HDFS, and then explores using Hive, the Hadoop SQL engine, where a lot of data science work happens. To wrap up the course, Ben covers techniques for running fast queries in the Hive engine. Duration: 01:12:30.00
Content: If you're looking for work as a junior data analyst, engineer, or scientist, this course gives you the best techniques to land jobs in data science. Instructor Ben Sullins explores a hiring manager's mindset and shows you how to prepare a demo that you can bring to your job interview. Ben explains some best practices for being physically and mentally healthy and well-prepared for your interview. He covers what to bring to your interview, then goes over follow-up methods and steps you should take to negotiate what you want most from an offer. Ben's methods give you the confidence and practice that you need to land a job in data science. Note: This course was created by Ben Sullins and Free the Data Academy. We are pleased to host this training in our library. Duration: 00:31:57
Content: Get Ben Sullins's 12 must-have SQL techniques for data science pros: engineers, DevOps, data miners, programmers, and other systems specialists. Ben's tips focus on practical applications of SQL queries for data analysis. Learn how to retrieve data, join tables, calculate rolling averages and rankings, work with dates and times, use window functions, aggregate and filter data, and much more. Each tip is short, relevant, and up to date with current industry best practices, making this the perfect course for busy analysts who normally struggle to find time to build their skills. Duration: 00:59:23.00
Content: Apache Cassandra is a NoSQL database capable of handling large amounts of data that change rapidly. In this course, learn about the architecture of this popular database, and discover how to design Cassandra data models that support scalable applications. Dan Sullivan highlights the differences between Cassandra and relational databases, discusses the Cassandra Query Language (CQL), and shows techniques for modeling based on application query requirements. He also dives into Cassandra implementation details that impact data modeling choices, to help you reason through other design decisions while taking into account the database's architecture and limitations. Duration: 01:38:14.00
Content: Many data scientists know how to work with SQL, the industry-standard language for data analysis. But as data sizes grow, you need to know how to do more than simply read and write from a database. This course provides a more sophisticated approach to designing data models and optimizing queries in SQL. Instructor Dan Sullivan begins with the logical and physical design of tables, with particular focus on very large databases, and then presents a deep-dive review of indexes, including specialized indexes and when to use them. The next section introduces query optimization and shows how to optimize basic, multi-join, and more complex queries. The course also covers SQL extensions, including user-defined functions and specialized data types. The techniques taught here enable more efficient analysis of large data sets using SQL, statistics, and custom business logic. Duration: 02:30:38
Content: There is an increasing need for data scientists and analysts to understand relational data stores. Organizations have long used SQL databases to store transactional data as well as business intelligence related data. If you need to work with SQL databases, this course is designed to help you learn how to perform common data science tasks, including finding, exploring, and extracting data within relational databases. The course begins with a brief overview of SQL. It then covers the five major topics a data scientist should understand when working with relational databases: basic statistics in SQL, data preparation in SQL, advanced filtering and data aggregation, window functions, and preparing data for use with analytics tools. Duration: 01:24:09.00
Content: Discover how to leverage Scala, the popular language that combines object-oriented design with functional programming, in your data science work. In this course, learn about the Scala features most useful to data scientists, including custom functions, parallel processing, and programming Spark with Scala. Dan Sullivan kicks off the course with an introduction for non-Scala programmers. Next, he describes how to use SQL from Scala, a particularly useful concept for data scientists, since they often have to extract data from relational databases. He then covers parallel processing constructs in Scala, sharing techniques that are useful for medium-sized data sets that can be analyzed on a single server with multiple cores. Dan also focuses on using Scala with Spark, a distributed processing platform. He first describes how to work with Resilient Distributed Datasets (RDDs), a fundamental Spark data structure, and then explains how to use Scala with Spark DataFrames, a new class of data structure specially designed for analytic processing. He wraps up the course by providing a summary of the advantages of using Scala for data science. Duration: 01:51:32.00
Content: There is an increasing need for data scientists and analysts to understand relational data stores. Organizations have long used SQL databases to store transactional data as well as business intelligence related data. This course was designed for data scientists who need to work with SQL databases. Specifically, it was designed to help these professionals learn how to perform common data science tasks, including exploration and extraction of data within relational databases. Instructor Dan Sullivan kicks off the course with a brief overview of SQL data manipulation and data definition commands. He then focuses on how to use SQL queries to prepare data for analysis; leverage statistical functions to better understand that data; and work with aggregates, window operations, and more. Duration: 02:38:49
Content: Explore DataFrames, a widely used data structure in Apache Spark. DataFrames allow Spark developers to perform common data operations, such as filtering and aggregation, as well as advanced data analysis on large collections of distributed data. With the addition of Spark SQL, developers have access to an even more popular and powerful query language than the built-in DataFrames API. In this course, instructor Dan Sullivan shows how to perform basic operations (loading, filtering, and aggregating data in DataFrames) with the API and SQL, as well as more advanced techniques that are easily performed in SQL. Dan also explains how to join data, eliminate duplicates, and deal with null or NA values. The lessons conclude with three in-depth examples of using DataFrames for data science: exploratory data analysis, time series analysis, and machine learning. Duration: 01:53:25.00
Content: Time series data is data gathered over time: performance metrics, user interactions, and information collected by sensors. Since different time series data have different measures and different intervals, these data present a unique challenge for data scientists. However, SQL has some features designed to help. This course teaches you how to standardize and model time series data with them. Instructor Dan Sullivan discusses windowing and the difference between sliding and tumbling window calculations. Then, learn how SQL constructs such as OVER and PARTITION BY help to simplify analysis, and how denormalization can be used to augment data while avoiding joins. Plus, discover optimization techniques such as indexing. Dan also introduces time series analysis techniques such as previous time period comparisons, moving averages, exponential smoothing, and linear regression. Duration: 01:18:52.00
Content: SQL queries can be fast and highly efficient, but they can also be slow and demand excessive CPU and memory resources. For many SQL programmers, occasional bouts with long-running queries and poor performance are simply par for the course. But by gaining a better understanding of how databases translate SQL queries into execution plans, you can take steps to avoid these issues. In this course, Dan Sullivan shows developers how to analyze query execution plans and use data modeling strategies to boost query performance. Dan describes how SQL queries are executed; highlights different types of indexes and how they factor in query tuning; covers several methods for performing joins; and discusses how to use partitioning and materialized views to improve performance. Duration: 01:44:39.00
Content: Machine learning models often run in complex production environments that can adapt to the ebb and flow of big data. The tools and practices that help data scientists rapidly build machine learning models are not sufficient to deploy those models at scale. To deliver scalable solutions, you need a whole new toolset. This course provides data scientists and DevOps engineers with an overview of common design patterns for scalable machine learning architectures, as well as tools for deploying and maintaining machine learning models in production. Instructor Dan Sullivan reviews three technologies that enable scalable machine learning: services that expose models through APIs, containers for deploying models, and orchestration tools like Kubernetes that help manage containers and clusters. Plus, get tips for monitoring the performance of your services in production environments. Duration: 01:43:10.00