Vijay Patil

30 Followers

Published in Analytics Vidhya

·Pinned

Predictive models using Rolling Window Features (I)

Rolling Window: When building a predictive model, the ask is often to predict what will happen next, or what will happen in the next X days or X weeks. The model, along with its features and dependent variable, needs to be designed to accommodate this relative time element. The following article gives a walkthrough…

Pyspark

6 min read


Published in Analytics Vidhya

·Sep 3, 2022

XGBoost with PySpark on AWS EMR

Compatible versions to train and deploy XGBoost using PySpark: The following article walks through the steps needed to use XGBoost with PySpark on AWS EMR. XGBoost does not provide a PySpark API; it only provides Scala and other APIs. Hence we will be using a custom Python wrapper for XGBoost from this PR. We…

Spark

4 min read


Mar 20, 2022

Installing and using PySpark on Linux machine

Installation steps simplified: The steps below have been tried on WSL on a Windows 10 laptop, with two different Spark versions (2.4.5 and 3.1.2). I have used WSL, but the steps will work on an Ubuntu machine as well. Installing Prerequisites: PySpark requires Java version 7 or later, and Python version 2.6–3.7 for Spark 2.x.x and Python…

Spark

5 min read


Mar 13, 2022

Product Recommendation Strategies

A non-exhaustive but fairly large list of product recommendation strategies in Fashion Retail (ecommerce). Personalization: Personalization consists of tailoring a service or a product to accommodate specific individuals, or groups/segments of individuals. Almost every organization today uses, or wants to use, personalization to improve customer satisfaction and sales. When it comes…

Product Recommendations

8 min read


Published in Analytics Vidhya

·Jan 9, 2022

Predictive models using Rolling Window Features (II)

Part 2 of the Rolling Window approach series. Quick Recap: When building a predictive model, the ask is often to predict what will happen next, or what will happen in the next X days or X weeks. …

Pyspark

6 min read


Published in Analytics Vidhya

·Sep 19, 2021

Getting started with Airflow

How-to guide on setting up Airflow on a Linux machine and creating a basic workflow using BashOperator, PythonOperator and MySqlOperator. Airflow (airflow.apache.org) is a platform created by the community to programmatically author, schedule and monitor workflows. It is a powerful tool when it comes to deploying and monitoring workflows. It uses standard Python features to create workflows, including date time formats for scheduling and loops to dynamically generate tasks. …

Airflow

8 min read


Published in Analytics Vidhya

·Jul 31, 2021

Clustering and profiling customers using k-Means

The following article walks through the flow of a clustering exercise using customer sales data. It covers the following steps: converting input sales data to a feature dataset that can be used for clustering; performing the clustering exercise; profiling the clusters; and setting up a regular scoring process to assign cluster labels…

Cluster Analysis

9 min read


Published in Analytics Vidhya

·Dec 22, 2020

Installing and using PySpark on Windows machine

Installation steps simplified (and automated to a certain extent…): The steps below have been tried on two different Windows 10 laptops, with two different Spark 2.x.x versions and with Spark 3.1.2. Installing Prerequisites: PySpark requires Java version 7 or later and Python version 2.6 or later. Java: To check whether Java is already available and find its version, open a Command Prompt…

Pyspark

6 min read

Data Analytics consultant

Following
  • Dariusz Gross #DATAsculptor
  • Cassie Kozyrkov
  • McKinsey Digital
  • Shanaka Chathuranga
  • Jack Harding
