Wrong regression results #112

@pablobaezlinero

Hi everyone,

I am trying to plot some time vs. distance data, together with a polynomial regression of degree 3. The data is the following:

Timestamp (ms) Distance (m)
44156000 0
44954000 826
44991000 863
45501000 1368
45633000 1503
45649000 1531
45667000 1572

I have done this in Python with the following code:

import numpy as np
from sklearn.metrics import r2_score
import matplotlib.pyplot as plt

x = [44156000, 44954000, 44991000, 45501000, 45633000, 45649000, 45667000]
y = [0, 826, 863, 1368, 1503, 1531, 1572]

mymodel = np.poly1d(np.polyfit(x, y, 3))
x_fit = np.linspace(x[0], x[-1], 10000)
plt.scatter(x, y)
plt.plot(x_fit, mymodel(x_fit), color='C1')
plt.title('With direct numbers')
plt.show()

print(f'R^2 = {r2_score(y, mymodel(x))}')
# np.poly1d indexing is by power: mymodel[k] is the coefficient of x^k
print(f'Equation: {mymodel[3]} * x^3 + {mymodel[2]} * x^2 + {mymodel[1]} * x + {mymodel[0]}')

And I get the following plot:

[Image: scatter plot of the data with the degree-3 fit, titled 'With direct numbers']

with the following output:

R^2 = 0.9997873136936645
Equation: 2.0378899867907668e-16 * x^3 + -2.7474751118538763e-08 * x^2 + 1.2356259030900478 * x + -18536197.338221278
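
As a cross-check on the fit above, the sketch below evaluates the cubic in plain Python from the four printed coefficients and recomputes R^2 as 1 - SS_res / SS_tot. It assumes the coefficients are read highest power first: the value near 2.04e-16 multiplies x^3 and the value near -1.85e7 is the constant term, which is what np.poly1d indexing by power implies. The helper name `horner` is illustrative, not part of any library.

```python
def horner(coeffs_high_to_low, x):
    """Evaluate a polynomial whose coefficients are listed highest power first."""
    result = 0.0
    for c in coeffs_high_to_low:
        result = result * x + c
    return result


# Coefficients copied from the printed output, ordered x^3, x^2, x, constant
# (assumption: mymodel[k] is the coefficient of x^k).
coeffs = [
    2.0378899867907668e-16,
    -2.7474751118538763e-08,
    1.2356259030900478,
    -18536197.338221278,
]

x = [44156000, 44954000, 44991000, 45501000, 45633000, 45649000, 45667000]
y = [0, 826, 863, 1368, 1503, 1531, 1572]

predicted = [horner(coeffs, xi) for xi in x]

# R^2 = 1 - (residual sum of squares) / (total sum of squares)
mean_y = sum(y) / len(y)
ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))
ss_tot = sum((yi - mean_y) ** 2 for yi in y)
r2 = 1 - ss_res / ss_tot

print(f"R^2 recomputed by hand: {r2}")
```

If the recomputed value is close to the one reported by r2_score, the coefficient ordering assumed here is the right one.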

But when I try to do the same with Highcharts, I get this plot:

[Image: Highcharts scatter plot with the polynomial regression line]

with an R^2 score of about 0.5323.

The code I have used is the following:

$(function() {
  $('#container').highcharts({
    chart: {
      type: 'scatter',
      zoomType: 'xy'
    },
    title: {
      text: 'Polynomial regression - with extrapolation and different style'
    },
    subtitle: {
      text: 'Source: Heinz  2003'
    },
    xAxis: {
      title: {
        enabled: true,
        text: 'Timestamp (ms)'
      },
      startOnTick: true,
      endOnTick: true,
      showLastLabel: true
    },
    yAxis: {
      title: {
        text: 'Distance (m)'
      }
    },
    legend: {
      layout: 'vertical',
      align: 'left',
      verticalAlign: 'top',
      x: 100,
      y: 70,
      floating: true,
      backgroundColor: '#FFFFFF',
      borderWidth: 1
    },
    plotOptions: {
      scatter: {
        marker: {
          radius: 5,
          states: {
            hover: {
              enabled: true,
              lineColor: 'rgb(100,100,100)'
            }
          }
        },
        states: {
          hover: {
            marker: {
              enabled: false
            }
          }
        },
        tooltip: {
          headerFormat: '<b>{series.name}</b><br>',
          pointFormat: '{point.x} ms, {point.y} m'
        }
      }
    },
    series: [{
      regression: true,
      regressionSettings: {
        type: 'polynomial',
        color: 'rgba(223, 183, 83, .9)',
        order: 3,
        dashStyle: 'dash'
      },
      name: 'Test input',
      color: 'rgba(223, 83, 83, .5)',
      data: [
      [44156000, 0],
      [44954000, 826],
      [44991000, 863],
      [45501000, 1368],
      [45633000, 1503],
      [45649000, 1531],
      [45667000, 1572]
      ]
    }]
  });
});

Personally, I think that the latter result makes no sense, and the regression fails.

Am I doing something wrong? Why does this happen?

Thank you everybody,
Pablo
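
A possible explanation, offered as an assumption that has not been checked against the regression plugin's source: with x values near 4.4e7, a degree-3 least-squares fit solved through the normal equations involves power sums up to x^6 (around 1e46), which leaves very little usable precision in double arithmetic. The sketch below solves the normal equations in plain Python after rescaling x to the range [0, 1]. The function names `fit_poly_normal_eq` and `r_squared` are illustrative.

```python
def fit_poly_normal_eq(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations (A^T A) c = A^T y.

    Returns coefficients in increasing power order: c[0] + c[1]*x + ...
    """
    n = degree + 1
    # Normal matrix and right-hand side built from power sums of the inputs.
    m = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    rhs = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]

    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        rhs[col], rhs[pivot] = rhs[pivot], rhs[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n):
                m[row][k] -= factor * m[col][k]
            rhs[row] -= factor * rhs[col]

    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][k] * coeffs[k] for k in range(i + 1, n))
        coeffs[i] = (rhs[i] - tail) / m[i][i]
    return coeffs


def r_squared(ys, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


x = [44156000, 44954000, 44991000, 45501000, 45633000, 45649000, 45667000]
y = [0, 826, 863, 1368, 1503, 1531, 1572]

# Shift and scale the timestamps so they lie in [0, 1] before fitting.
span = x[-1] - x[0]
t = [(xi - x[0]) / span for xi in x]

coeffs = fit_poly_normal_eq(t, y, 3)
predicted = [sum(c * ti ** k for k, c in enumerate(coeffs)) for ti in t]
r2_scaled = r_squared(y, predicted)

print(f"R^2 on rescaled x: {r2_scaled:.6f}")
```

If the rescaled fit behaves well, the same shift and scale could be applied to the timestamps before they are passed to the chart, converting back only for display.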
