Understanding How Genetic Algorithms Work (with Python Implementation) (4)

# fill missing item weights with the column mean
train['Item_Weight'] = train['Item_Weight'].fillna(train['Item_Weight'].mean())
test['Item_Weight'] = test['Item_Weight'].fillna(test['Item_Weight'].mean())

### reducing fat content to only two categories

train['Item_Fat_Content'] = train['Item_Fat_Content'].replace(['low fat', 'LF'], ['Low Fat', 'Low Fat'])
train['Item_Fat_Content'] = train['Item_Fat_Content'].replace(['reg'], ['Regular'])
test['Item_Fat_Content'] = test['Item_Fat_Content'].replace(['low fat', 'LF'], ['Low Fat', 'Low Fat'])
test['Item_Fat_Content'] = test['Item_Fat_Content'].replace(['reg'], ['Regular'])

# convert the establishment year into the outlet's age as of 2013
train['Outlet_Establishment_Year'] = 2013 - train['Outlet_Establishment_Year']
test['Outlet_Establishment_Year'] = 2013 - test['Outlet_Establishment_Year']

# treat missing outlet sizes as 'Small'
train['Outlet_Size'] = train['Outlet_Size'].fillna('Small')
test['Outlet_Size'] = test['Outlet_Size'].fillna('Small')

# square-root transform of item visibility
train['Item_Visibility'] = np.sqrt(train['Item_Visibility'])
test['Item_Visibility'] = np.sqrt(test['Item_Visibility'])

# label-encode the categorical columns on train and test combined
# (`number` is assumed to be a LabelEncoder instance created earlier, outside this section)
col = ['Outlet_Size', 'Outlet_Location_Type', 'Outlet_Type', 'Item_Fat_Content']
test['Item_Outlet_Sales'] = 0
combi = pd.concat([train, test])  # DataFrame.append was removed in pandas 2.0
for i in col:
    combi[i] = number.fit_transform(combi[i].astype('str'))
    combi[i] = combi[i].astype('object')

train = combi[:train.shape[0]]
test = combi[train.shape[0]:].drop('Item_Outlet_Sales', axis=1)
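To see why the categorical columns are encoded on the combined frame rather than on each split separately, here is a minimal sketch in plain Python. The helper names `fit_label_mapping` and `encode` and the toy `Outlet_Size` values are made up for illustration; the mapping only mimics the sorted-order integer coding of a label encoder and is not the sklearn implementation.

```python
# Plain-Python stand-in for a label encoder (illustrative only).

def fit_label_mapping(values):
    """Map each distinct string to an integer, in sorted order."""
    return {label: code for code, label in enumerate(sorted(set(values)))}


def encode(values, mapping):
    """Replace each value by its integer code."""
    return [mapping[v] for v in values]


# Hypothetical toy columns: the test split happens to lack the 'High' category.
train_sizes = ["Small", "Medium", "High", "Small"]
test_sizes = ["Medium", "Small", "Small"]

# Fitting separately: the same category can receive a different code in each split.
separate_train = fit_label_mapping(train_sizes)
separate_test = fit_label_mapping(test_sizes)

# Fitting once on the combined values keeps the codes consistent across splits.
combined = fit_label_mapping(train_sizes + test_sizes)
train_codes = encode(train_sizes, combined)
test_codes = encode(test_sizes, combined)

print(separate_train["Small"], separate_test["Small"])
print(train_codes, test_codes)
```

If the two splits were encoded separately, a model trained on one coding would silently misread the other, which is presumably why the code above stacks the two frames before encoding and splits them again afterwards.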

## removing id variables

tpot_train = train.drop(['Outlet_Identifier', 'Item_Type', 'Item_Identifier'], axis=1)
tpot_test = test.drop(['Outlet_Identifier', 'Item_Type', 'Item_Identifier'], axis=1)
target = tpot_train['Item_Outlet_Sales']
tpot_train.drop('Item_Outlet_Sales', axis=1, inplace=True)

# finally building model using tpot library

from sklearn.model_selection import train_test_split
from tpot import TPOTRegressor

X_train, X_test, y_train, y_test = train_test_split(tpot_train, target, train_size=0.75, test_size=0.25)
tpot = TPOTRegressor(generations=5, population_size=50, verbosity=2)
tpot.fit(X_train, y_train)
print(tpot.score(X_test, y_test))
tpot.export('tpot_exported_pipeline.py')


Once this code finishes running, tpot_exported_pipeline.py will contain the Python code for the optimized pipeline. We can see that ExtraTreeRegressor solves this problem best.

## predicting using tpot optimised pipeline

tpot_pred = tpot.predict(tpot_test)
sub1 = pd.DataFrame(data=tpot_pred)

#sub1.index = np.arange(0, len(test)+1)

# the prediction column is labeled with the integer 0, not the string '0'
sub1 = sub1.rename(columns={0: 'Item_Outlet_Sales'})
sub1['Item_Identifier'] = test['Item_Identifier']
sub1['Outlet_Identifier'] = test['Outlet_Identifier']
sub1.columns = ['Item_Outlet_Sales', 'Item_Identifier', 'Outlet_Identifier']
sub1 = sub1[['Item_Identifier', 'Outlet_Identifier', 'Item_Outlet_Sales']]
sub1.to_csv('tpot.csv', index=False)

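If you submit this csv, you will notice that what I promised at the start has not been fully delivered. So was I lying to you? Of course not. The TPOT library follows a simple rule: if you do not run TPOT for long enough, it will not find the best possible pipeline for your problem.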

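So you need to increase the number of generations, grab a cup of coffee, go for a walk, and leave the rest to TPOT. You can also use this library for classification problems. For more details, refer to the TPOT documentation. Beyond competitions, genetic algorithms also have many applications in everyday life.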
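The effect of running for more generations can be seen on a toy problem without TPOT at all. The sketch below is not TPOT's internal algorithm; it is a small hand-written genetic algorithm (the function name `run_onemax_ga`, the parameter values, and the "count the 1-bits" fitness are all assumptions chosen for illustration). Because the best individual is carried over unchanged each generation and both runs use the same seed, the longer run can never end with a worse best fitness than the shorter one.

```python
import random


def run_onemax_ga(generations, population_size=20, genome_length=40, seed=0):
    """Toy genetic algorithm maximizing the number of 1-bits in a genome.

    Returns the best fitness found after the given number of generations.
    """
    rng = random.Random(seed)
    population = [
        [rng.randint(0, 1) for _ in range(genome_length)]
        for _ in range(population_size)
    ]

    def fitness(genome):
        return sum(genome)

    for _ in range(generations):
        # Elitism: carry the current best individual over unchanged.
        elite = max(population, key=fitness)
        next_population = [elite[:]]
        while len(next_population) < population_size:
            # Tournament selection of two parents.
            parent_a = max(rng.sample(population, 3), key=fitness)
            parent_b = max(rng.sample(population, 3), key=fitness)
            # One-point crossover.
            cut = rng.randint(1, genome_length - 1)
            child = parent_a[:cut] + parent_b[cut:]
            # Bit-flip mutation with a small per-gene probability.
            child = [1 - g if rng.random() < 0.02 else g for g in child]
            next_population.append(child)
        population = next_population

    return fitness(max(population, key=fitness))


short_run = run_onemax_ga(generations=5)
long_run = run_onemax_ga(generations=50)
print("best fitness after 5 generations:", short_run)
print("best fitness after 50 generations:", long_run)
```

In TPOT the corresponding knob is the `generations` argument passed to `TPOTRegressor` in the code earlier in this article; raising it costs more run time, since every candidate pipeline has to be trained and scored.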

6. Real-World Applications

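Genetic algorithms have many real-world applications. I have listed some of the more interesting ones here, but because of space constraints I will not cover each one in detail.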

6.1 Engineering Design

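Engineering design relies heavily on computer modeling and simulation to keep the design cycle fast and economical. Genetic algorithms can be used here to optimize a design and deliver a good solution.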
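As a toy illustration of design optimization, the sketch below uses a small real-valued genetic algorithm to choose the radius of a closed cylindrical can that holds a fixed volume with the least surface area. The problem, the 1000 cm^3 volume, the search bounds, and the function names are all assumptions made up for this example; real engineering models are far more expensive to evaluate. The problem was chosen because it has a known closed-form answer to compare against.

```python
import math
import random

VOLUME = 1000.0  # required volume in cm^3 (hypothetical design requirement)


def surface_area(radius):
    """Surface area of a closed cylinder whose height is set by the fixed volume."""
    height = VOLUME / (math.pi * radius ** 2)
    return 2 * math.pi * radius ** 2 + 2 * math.pi * radius * height


def optimize_radius(generations=60, population_size=30, seed=1):
    """Real-valued GA: minimize surface area over the radius in [0.1, 10] cm."""
    rng = random.Random(seed)
    low, high = 0.1, 10.0
    population = [rng.uniform(low, high) for _ in range(population_size)]

    for _ in range(generations):
        population.sort(key=surface_area)
        survivors = population[: population_size // 2]  # keep the better half
        children = []
        while len(survivors) + len(children) < population_size:
            a, b = rng.sample(survivors, 2)
            child = (a + b) / 2 + rng.gauss(0, 0.1)  # blend crossover + mutation
            children.append(min(max(child, low), high))  # clamp to the bounds
        population = survivors + children

    return min(population, key=surface_area)


best_radius = optimize_radius()
best_area = surface_area(best_radius)
# Closed-form optimum for comparison: r = (V / (2*pi))^(1/3), where height = 2r.
analytic_radius = (VOLUME / (2 * math.pi)) ** (1 / 3)
print(f"GA radius: {best_radius:.3f} cm, area: {best_area:.1f} cm^2")
print(f"analytic radius: {analytic_radius:.3f} cm")
```

The genetic algorithm only ever calls `surface_area`, so the same loop would work if that function were replaced by a simulation with no closed-form solution, which is the situation the paragraph above describes.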