App 5 — Linear Regression Explorer

📋 Dataset — Binary Semiconductors (DFT-PBE Band Gaps)

#	Compound	a (Å)	ΔEN	Val. e⁻	E_g DFT (eV)	Ê_g Pred. (eV)	|Error| (eV)

⚙️ Feature Selection

Lattice constant a (Å)

Electronegativity diff ΔEN

Valence electrons

🔒 Regularisation

Type:

📈 Model Performance

R²

–

MAE (eV)

–

RMSE (eV)

–

🧮 Fitted Equation

Select features and fit the model

🔮 Predict a New Material

Lattice constant a (Å) 5.00

Electronegativity diff ΔEN 0.50

Valence electrons 12

Predicted Band Gap

–

🔄 Gradient Descent — Watch the Model Learn

Gradient descent starts with all weights set to 0 and iteratively adjusts them to minimise the MSE cost. Watch the loss fall toward its minimum as the weights approach the optimal solution.
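The update loop described here can be sketched in a few lines of NumPy. This is a minimal illustration, not the app's actual implementation: the function name `gradient_descent`, the cost convention J(θ) = (1/n) Σ (ŷ − y)², and the randomly generated data are all assumptions. The defaults mirror the hyperparameters shown on this page (η = 0.001, 200 iterations).

```python
import numpy as np

def gradient_descent(X, y, lr=0.001, max_iter=200):
    """Batch gradient descent for linear regression with an MSE cost.

    Hypothetical helper: assumes J(theta) = mean((X_b @ theta - y) ** 2).
    """
    n_samples = X.shape[0]
    # Prepend a column of ones so theta[0] plays the role of the bias.
    X_b = np.column_stack([np.ones(n_samples), X])
    theta = np.zeros(X_b.shape[1])  # all weights start at 0
    history = []
    for _ in range(max_iter):
        residual = X_b @ theta - y
        history.append(float(np.mean(residual ** 2)))  # cost before the update
        grad = (2.0 / n_samples) * (X_b.T @ residual)  # gradient of the MSE
        theta -= lr * grad
    return theta, history

# Synthetic placeholder data (NOT real band gaps): two standardised features.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2))
y = 1.5 + 0.8 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.1, size=20)

theta, history = gradient_descent(X, y, lr=0.001, max_iter=200)
print("first cost:", history[0], "last cost:", history[-1])
```

With a learning rate this small the cost falls steadily but slowly, so 200 iterations will typically not reach the least-squares optimum. Raising η speeds convergence until the updates start to overshoot.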

⚙️ Hyperparameters

Learning rate η 0.0010

Max iterations 200

📊 Current Weights

θ₀ (bias)

0.000

θ₁ (a)

0.000

θ₂ (ΔEN)

0.000

MSE Cost J(θ)

–

📉 Loss Curve — MSE vs. Iteration

▶ Press "Run GD" to start gradient descent...

🎯 Parity Plot — DFT vs. Predicted Band Gap

A perfect model places every point on the diagonal y = x. With DFT values on the vertical axis and predictions on the horizontal axis, points above the line are underpredicted and points below it are overpredicted.
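The same reading of the parity plot can be expressed numerically by checking the sign of the difference between prediction and reference. The sketch below assumes the DFT value is plotted on the vertical axis, which is what "above the line = underpredicted" implies. The helper name `classify_parity` and the three sample values are made up for illustration and are not real band gaps.

```python
import numpy as np

def classify_parity(e_dft, e_pred, tol=1e-9):
    """Label each point by the sign of (predicted - DFT).

    Hypothetical helper: a positive difference means the model overpredicts.
    """
    diff = np.asarray(e_pred, dtype=float) - np.asarray(e_dft, dtype=float)
    return np.where(
        diff > tol, "overpredicted",
        np.where(diff < -tol, "underpredicted", "on diagonal"),
    )

# Illustrative numbers only.
labels = classify_parity([1.0, 2.0, 3.0], [1.5, 2.0, 2.4])
print(labels.tolist())
```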

II-VI oxides / wide gap

III-V nitrides

III-V standard

IV-IV / elemental

Fit the model in the Model tab first.

🔬 Feature Importance — Coefficient Magnitudes

After standardising features (mean=0, std=1), the coefficient magnitudes tell us which features the model relies on most.
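A short sketch of why standardisation matters for this comparison: raw coefficients depend on each feature's units, while coefficients fitted on z-scored features are directly comparable. The helper `standardised_coefficients` and the synthetic data are assumptions for illustration. An ordinary least-squares fit via `np.linalg.lstsq` stands in for whatever solver the app uses.

```python
import numpy as np

def standardised_coefficients(X, y):
    """Fit least squares on z-scored features and return the feature weights.

    Hypothetical helper: the bias term is fitted but not returned.
    """
    X = np.asarray(X, dtype=float)
    Z = (X - X.mean(axis=0)) / X.std(axis=0)   # mean = 0, std = 1 per column
    Z_b = np.column_stack([np.ones(len(Z)), Z])
    theta, *_ = np.linalg.lstsq(Z_b, y, rcond=None)
    return theta[1:]

# Synthetic data: feature 0 has a large spread, feature 1 a small one.
rng = np.random.default_rng(1)
X = np.column_stack([rng.normal(0.0, 10.0, 50), rng.normal(0.0, 0.1, 50)])
y = 0.1 * X[:, 0] + 1.0 * X[:, 1] + rng.normal(scale=0.01, size=50)

coefs = standardised_coefficients(X, y)
print("standardised |coefficients|:", np.abs(coefs))
```

In this toy case the raw slope on feature 1 (1.0) is ten times the raw slope on feature 0 (0.1), yet after standardisation feature 0 carries the larger weight because it varies over a much wider range.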

📊 Band Gap vs. Lattice Constant — Regression Line

📊 Band Gap vs. ΔEN — Regression Line

🔢 Feature Correlation Matrix

Cell values show Pearson correlation (−1 to +1). High correlation between features (multicollinearity) can destabilise linear regression coefficients.
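A matrix of this kind can be computed with `np.corrcoef`. The sketch below uses synthetic columns, with the third deliberately built from the first so that a strongly collinear pair shows up. The variable names are placeholders and the numbers are not drawn from the dataset on this page.

```python
import numpy as np

# Synthetic feature columns; f2 is constructed to be nearly collinear with f0.
rng = np.random.default_rng(2)
f0 = rng.normal(size=40)
f1 = rng.normal(size=40)
f2 = 0.9 * f0 + 0.1 * rng.normal(size=40)

X = np.column_stack([f0, f1, f2])
# rowvar=False tells NumPy that columns (not rows) are the variables.
corr = np.corrcoef(X, rowvar=False)
print(np.round(corr, 2))
```

The diagonal is always 1 and the matrix is symmetric. An off-diagonal entry close to ±1, such as the one between `f0` and `f2` here, signals that the two features carry nearly the same information.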

📝 Quiz — Test Your Understanding

Progress: 0 / 5
