Problem description
I have a graph that looks like this:
And the code I'm running to get this graph (one of a sequence of 8 graphs) is below:
    date_list = list(df_testing_set['date'].unique())
    random_date_list = list(np.random.choice(date_list, 8))
    df_new = df_testing_set[df_testing_set['date'].isin(random_date_list)]

    for date1 in random_date_list:
        df_new = df_testing_set[df_testing_set['date'] == date1]
        title = date1
        if df_new.iloc[0]['day'] in ['Saturday', 'Sunday']:
            df_shader = df_result_weekend.copy()
            title += " - Weekend"
        else:
            df_shader = df_result_weekday.copy()
            title += " - Weekday"

        y = df_new[row_index].tolist()
        x = range(0, len(y))
        x_axis = buckets
        y_axis = df_shader.loc[
            df_shader.index.isin([row_index]) & df_shader['Bucket'].between(1, 144),
            data_field,
        ].tolist()
        del y_axis[-1]

        plt.title(title)
        plt.xlabel("Time of Day (10m Intervals)")
        plt.ylabel(data_field + " values for " + row_index)

        standevs = df_shader.loc[
            df_shader.index.isin([row_index]) & df_shader['Bucket'].between(1, 144),
            'StanDev',
        ].tolist()
        del standevs[-1]

        lower_bound = np.array(y_axis) - np.array(standevs)
        upper_bound = np.array(y_axis) + np.array(standevs)
        plt.fill_between(x_axis, lower_bound, upper_bound, facecolor='lightblue')

        # highlighting anomalies
        # if (y > upper_bound | y < lower_bound):
        #     plt.plot(x, y, 'rx')
        # else:
        #     plt.plot(x, y)

        plt.plot(x, y)
        plt.show()
        del df_shader, title, date1, df_new
I'm trying to create a condition (like the commented-out if statement) so that when a plotted point goes above upper_bound or below lower_bound, it is marked with an 'x' in a different color. Ultimately, a point that exceeds the threshold by 1 standard deviation should be marked in orange, and one that exceeds it by 2 or more standard deviations should be marked in red. All the standard deviations are in the data frame df_shader under the column StanDev. Whenever I try to run some variation of the if block, I get variable errors and name errors.
Recommended answer
You can use boolean masks to select points that fulfill certain conditions, and plot them:
    import matplotlib.pyplot as plt
    import numpy as np

    std = 0.1
    N = 100

    x = np.linspace(0, 1, N)
    expected_y = np.sin(2 * np.pi * x)
    y = expected_y + np.random.normal(0, std, N)

    # distance from the expected value, in units of standard deviations
    dist = np.abs(y - expected_y) / std

    mask1 = (1 < dist) & (dist <= 2)  # between 1 and 2 standard deviations
    mask2 = dist > 2                  # more than 2 standard deviations

    plt.fill_between(x, expected_y - 0.1, expected_y + 0.1, alpha=0.1)
    plt.fill_between(x, expected_y - 0.2, expected_y + 0.2, alpha=0.1)

    plt.plot(x, y)
    plt.plot(x[mask1], y[mask1], 'x')
    plt.plot(x[mask2], y[mask2], 'x')

    plt.tight_layout()
    plt.savefig('mp_points.png', dpi=300)
Result:
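As a rough sketch of how the same masking idea might carry over to per-bucket standard deviations like those in the question, the snippet below measures each point's distance from the expected value in units of its own standard deviation. The helper name `anomaly_masks`, the synthetic data, and the output file name are made up for illustration. It assumes "exceeds by 1 standard deviation" means between 1 and 2 standard deviations from the expected value, as in the answer above, and that no standard deviation is zero.

```python
import numpy as np
import matplotlib.pyplot as plt


def anomaly_masks(y, expected, standevs):
    """Return (orange, red) boolean masks.

    orange: points more than 1 and at most 2 std devs from expected
    red:    points more than 2 std devs from expected
    Hypothetical helper; assumes every entry of standevs is non-zero.
    """
    y = np.asarray(y, dtype=float)
    expected = np.asarray(expected, dtype=float)
    standevs = np.asarray(standevs, dtype=float)
    dist = np.abs(y - expected) / standevs
    orange = (dist > 1) & (dist <= 2)
    red = dist > 2
    return orange, red


# Synthetic stand-ins for the question's y, y_axis and standevs (made-up data).
rng = np.random.default_rng(0)
x = np.arange(144)
expected = 50 + 10 * np.sin(2 * np.pi * x / 144)
standevs = np.full(x.shape, 3.0)
y = expected + rng.normal(0, 3.0, x.size)

orange, red = anomaly_masks(y, expected, standevs)

plt.fill_between(x, expected - standevs, expected + standevs, facecolor='lightblue')
plt.plot(x, y)
plt.plot(x[orange], y[orange], 'x', color='orange')
plt.plot(x[red], y[red], 'x', color='red')
plt.savefig('anomalies.png')
```

In the question's loop, `y`, `y_axis` and `standevs` would take the place of the synthetic arrays, and `x` would need to be a NumPy array (for example `np.arange(len(y))`) so that it can be indexed with a mask. The commented-out `if` most likely fails for two reasons: `|` binds more tightly than `>` and `<`, so each comparison needs its own parentheses, and an `if` cannot test a whole array at once, which is why the points are selected with masks instead.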