Introduction
A decision tree is a popular supervised learning algorithm used for both classification and regression tasks. It uses a tree-like structure in which each internal node represents a test on a feature, each branch represents an outcome of that test, and each leaf node represents a class label or output value.
Key Concepts
- Root Node: The topmost node of the tree, representing the entire dataset before any split.
- Decision Node: An internal node that tests a condition on a feature.
- Leaf Node: A terminal node that holds the result (class label or value).
- Splitting: Dividing a node into child nodes based on the value of a feature.
How Decision Tree Works
The algorithm recursively selects the feature (and split point) that best partitions the data according to a splitting criterion, typically one of the following (a short sketch of the first two follows this list):
- Gini impurity (classification)
- Information gain, based on entropy (classification)
- Variance reduction (regression)
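As a minimal sketch (using NumPy, which the original text does not mention), the two classification criteria can be computed directly from class proportions:

```python
import numpy as np

def gini(labels):
    # Gini impurity: 1 - sum(p_i^2) over the class proportions p_i
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def entropy(labels):
    # Shannon entropy: -sum(p_i * log2(p_i))
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

# The "Play Tennis" column of the table below has 3 Yes and 3 No:
labels = np.array(["No", "No", "Yes", "Yes", "Yes", "No"])
print(gini(labels))     # 0.5 -- the maximum for two balanced classes
print(entropy(labels))  # 1.0 -- one full bit of uncertainty
```

A candidate split is scored by how much it reduces this impurity in the child nodes, weighted by their sizes.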
Example
Let’s consider a simple example of predicting whether a person will play tennis based on weather conditions:
| Outlook  | Humidity | Wind   | Play Tennis |
|----------|----------|--------|-------------|
| Sunny    | High     | Weak   | No          |
| Sunny    | High     | Strong | No          |
| Overcast | High     | Weak   | Yes         |
| Rain     | High     | Weak   | Yes         |
| Rain     | Normal   | Weak   | Yes         |
| Rain     | Normal   | Strong | No          |
Based on this dataset, a learned decision tree might look like this (a runnable sketch follows the rules):
- If Outlook = Overcast → Play Tennis = Yes
- If Outlook = Sunny and Humidity = High → No
- If Outlook = Sunny and Humidity = Normal → Yes
- If Outlook = Rain and Wind = Weak → Yes
- If Outlook = Rain and Wind = Strong → No
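As a minimal sketch (assuming scikit-learn and pandas, neither of which the original text names), the table can be one-hot encoded and fit with DecisionTreeClassifier, and the learned rules printed:

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

# The six-row Play Tennis table from above.
data = pd.DataFrame({
    "Outlook":    ["Sunny", "Sunny", "Overcast", "Rain", "Rain", "Rain"],
    "Humidity":   ["High", "High", "High", "High", "Normal", "Normal"],
    "Wind":       ["Weak", "Strong", "Weak", "Weak", "Weak", "Strong"],
    "PlayTennis": ["No", "No", "Yes", "Yes", "Yes", "No"],
})

# One-hot encode the categorical features; scikit-learn trees need numbers.
X = pd.get_dummies(data[["Outlook", "Humidity", "Wind"]])
y = data["PlayTennis"]

clf = DecisionTreeClassifier(criterion="gini", random_state=0).fit(X, y)
print(export_text(clf, feature_names=list(X.columns)))
```

With only six rows the learned tree cannot reproduce every rule above (the table contains no Sunny/Normal example), but the printout shows the same kind of splitting logic.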
Advantages
- Easy to interpret and visualize
- No need for feature scaling
- Can handle both categorical and numerical data
Disadvantages
- Prone to overfitting, especially when the tree is grown to full depth
- Can generalize poorly if not properly pruned or regularized (see the sketch below)
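A hedged sketch of the usual remedies in scikit-learn (the parameter values are illustrative assumptions, not tuned recommendations):

```python
from sklearn.tree import DecisionTreeClassifier

clf = DecisionTreeClassifier(
    max_depth=4,         # cap tree depth (pre-pruning)
    min_samples_leaf=5,  # require at least 5 samples in every leaf
    ccp_alpha=0.01,      # cost-complexity (post-)pruning strength
    random_state=0,
)
# clf.fit(X_train, y_train)  # X_train / y_train are hypothetical data
```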
Applications
- Customer segmentation
- Loan approval systems
- Medical diagnosis
Conclusion
Decision trees are powerful yet simple tools for predictive modeling. They work well on small to medium datasets and make their decision process transparent. In practice, however, ensemble methods such as Random Forest or Gradient Boosted Trees are often preferred because they trade some of that transparency for better predictive performance.
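As a closing sketch (scikit-learn assumed, as above), swapping the single tree for an ensemble is a one-line change:

```python
from sklearn.ensemble import RandomForestClassifier

# An ensemble of 100 trees typically generalizes better than one tree,
# at the cost of interpretability; X and y are the encoded data from above.
clf = RandomForestClassifier(n_estimators=100, random_state=0)
# clf.fit(X, y)
```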