Problem background
2008 Sichuan provincial team selection for NOI (SCOI2008)
Problem description
You are playing your favorite video game and have just entered a bonus level. In this level the system throws k treasures one at a time, each chosen at random, and each time you may choose to eat the treasure or not. You must decide before the next treasure is thrown, and a treasure you decline cannot be eaten later.
There are n kinds of treasure. Each throw picks one of the n kinds with equal probability, independently of all other throws. That is, even if the system throws treasure 1 on each of the first k-1 throws (possible, though very unlikely), the probability of each kind on the k-th throw is still 1/n.
Eating a treasure of kind i earns Pi points, but not every treasure can be eaten freely. Kind i has a prerequisite set Si: it can be eaten only when every treasure in Si has already been eaten at least once (if the system throws a treasure that cannot currently be eaten, that throw is simply wasted). Note that Pi may be negative; if such a treasure is a prerequisite of many high-scoring treasures, eating it sacrifices short-term score for a larger long-term gain.
Assuming you follow the optimal strategy, how many points do you score in the bonus level on average?
Input and output format
Input format:
The first line contains two positive integers k and n: the number of treasures thrown and the number of kinds of treasure. Each of the following n lines describes one kind of treasure: the first integer is its score, and the following integers list its prerequisite treasures in turn (kinds are numbered 1 to n), terminated by 0.
Output format:
Output one real number rounded to six decimal places: the expected score under the optimal strategy.
Input and output example
Notes
1 <= k <= 100, 1 <= n <= 15; each score is an integer in [-10^6, 10^6].
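With n at most 15, the set of kinds eaten so far fits in the low bits of a single integer, and the prerequisite test becomes one mask comparison. A minimal sketch of that encoding follows; the helper names `canEat` and `eat` are hypothetical and do not appear in the original solution, and treasures are assumed to be renumbered from 0.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: true when every prerequisite bit in `prereq`
// is already set in `eaten`, i.e. the prerequisite set is a subset.
inline bool canEat(std::uint32_t eaten, std::uint32_t prereq) {
    return (eaten & prereq) == prereq;
}

// Hypothetical helper: the eaten set after eating 0-based treasure `t`.
inline std::uint32_t eat(std::uint32_t eaten, int t) {
    assert(t >= 0 && t < 32);
    return eaten | (std::uint32_t{1} << t);
}
```

For example, with treasures 0 and 2 eaten the mask is 5; a treasure whose prerequisite mask is 4 can be eaten, while one whose mask is 2 cannot.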
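Bitmask DP over subsets of eaten treasures. Let f[i][j] be the expected score still obtainable after i treasures have been thrown, given that the set of kinds already eaten is j. Work backward from i = k-1 to 0: for each of the n possible next treasures, if its prerequisites are contained in j, take the max of the successor expectation with eating (f[i+1][j | bit] + score) and without (f[i+1][j]); otherwise only f[i+1][j] is available. Average the n results. Going backward means unreachable states never need to be detected, and the answer is f[0][0]. The cost is O(k * 2^n * n), about 4.9 * 10^7 steps at the limits.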
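Written out as a recurrence, the idea looks roughly as follows. The notation is an assumption made here for illustration: $f(i, T)$ is the expected best future score after $i$ throws with eaten set $T$ (it corresponds to `f[i][j]` in the code), and $P_t$, $S_t$ are the score and prerequisite set from the statement.

```latex
f(k, T) = 0, \qquad
f(i, T) = \frac{1}{n} \sum_{t=1}^{n}
\begin{cases}
\max\bigl( f(i+1, T),\; f(i+1, T \cup \{t\}) + P_t \bigr), & S_t \subseteq T, \\
f(i+1, T), & \text{otherwise},
\end{cases}
\qquad 0 \le i < k .
```

The expected score under the optimal strategy is then $f(0, \emptyset)$.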
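#include<bits/stdc++.h>
#define ll long long
#define D double
using namespace std;

// ci[i] = 1<<i; pre[i] = prerequisite bitmask of treasure i; val[i] = its score
int ci[25],n,k,pre[25],val[25],now;
// f[i][j]: expected best future score after i throws with eaten set j
D f[105][40005],tmp;

inline void solve(){
    tmp=1/(D)n;                          // probability of each kind
    for(int i=k-1;i>=0;i--)              // rounds, from last to first
        for(int j=0;j<ci[n];j++)         // every eaten set
            for(int l=0;l<n;l++){        // every kind that may be thrown
                if((pre[l]&j)==pre[l])   // prerequisites met: eat or skip
                    f[i][j]+=tmp*max(f[i+1][j|ci[l]]+val[l],f[i+1][j]);
                else                     // cannot eat: the throw is wasted
                    f[i][j]+=tmp*f[i+1][j];
            }
}

int main(){
    ci[0]=1;
    for(int i=1;i<=20;i++) ci[i]=ci[i-1]<<1;
    scanf("%d%d",&k,&n);
    for(int i=0;i<n;i++){
        scanf("%d",val+i);
        // prerequisites are 1-based and terminated by 0
        while(scanf("%d",&now)==1&&now) pre[i]|=ci[now-1];
    }
    solve();
    printf("%.6lf\n",f[0][0]);
    return 0;
}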
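Since layer i of the table depends only on layer i+1, two layers are enough, which shrinks storage from (k+1) * 2^n doubles to 2 * 2^n. The following is a minimal sketch of that variant, not the original author's code: the function name `expectedScore` and its vector-based interface are hypothetical, and treasures are assumed to be 0-based with prerequisites already packed into bitmasks.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical helper: expected score under the optimal strategy.
// k      number of throws
// val[t] score of 0-based treasure t
// pre[t] bitmask of the prerequisites of treasure t
double expectedScore(int k, const std::vector<int>& val,
                     const std::vector<int>& pre) {
    const int n = static_cast<int>(val.size());
    assert(n >= 1 && n <= 20 && pre.size() == val.size());
    const int full = 1 << n;
    const double p = 1.0 / n;  // each kind is thrown with probability 1/n

    // next holds layer i+1 (all zeros after the last throw), cur holds layer i.
    std::vector<double> next(full, 0.0), cur(full, 0.0);
    for (int i = k - 1; i >= 0; --i) {
        for (int j = 0; j < full; ++j) {
            double sum = 0.0;
            for (int t = 0; t < n; ++t) {
                const double skip = next[j];
                if ((pre[t] & j) == pre[t])  // prerequisites met: eat or skip
                    sum += std::max(skip, next[j | (1 << t)] + val[t]);
                else                         // cannot eat: the throw is wasted
                    sum += skip;
            }
            cur[j] = sum * p;
        }
        std::swap(cur, next);  // the layer just computed becomes layer i+1
    }
    return next[0];  // layer 0, empty eaten set
}
```

As a hand-checked case: with k = 2 and two kinds, where kind 0 scores -1 with no prerequisite and kind 1 scores 10 but requires kind 0, the only profitable line is to eat kind 0 on the first throw (probability 1/2) and hope for kind 1 on the second, so the function returns 2.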