[SCOI2008] Rewards

Background

Sichuan provincial team selection for NOI 2008 (SCOI2008)

Problem description

You are playing your favorite video game and have just entered a bonus level. In this level, the system throws k treasures at random, one after another, and each time you can choose whether or not to eat the treasure (you must decide before the next treasure is thrown, and a treasure you decline cannot be eaten later).

There are n kinds of treasures in total. On every throw each kind is equally likely, independently of all earlier throws. In other words, even if the system throws treasure 1 in each of the first k-1 rounds (possible, though very unlikely), the probability of each kind on the k-th throw is still 1/n.

Eating the i-th kind of treasure earns Pi points, but not every treasure can be eaten at will. The i-th kind has a set of prerequisite treasures Si, and it can be eaten only after every treasure in Si has been eaten at least once (if the system throws a treasure that cannot currently be eaten, that throw is simply wasted). Note that Pi can be negative; if such a treasure is a prerequisite of many high-scoring treasures, eating it sacrifices points now for a larger gain later.
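Since n is at most 15, the set of treasures eaten so far and each prerequisite set Si fit in the bits of one integer. The sketch below shows one way to test whether a treasure can be eaten; the helper names `bitOf` and `canEat` are made up for illustration and do not appear in the original solution.

```cpp
#include <cstdint>

// Hypothetical helper: treasures are numbered from 1,
// so treasure t is stored in bit (t - 1) of a mask.
inline std::uint32_t bitOf(int treasure) {
    return 1u << (treasure - 1);
}

// True when every prerequisite bit in `pre` is also set in `eaten`,
// i.e. the prerequisite set is a subset of the eaten set.
inline bool canEat(std::uint32_t eaten, std::uint32_t pre) {
    return (eaten & pre) == pre;
}
```

For example, a treasure whose prerequisites are treasures 2 and 5 has the mask `bitOf(2) | bitOf(5)`, and a treasure with no prerequisites has mask 0 and can always be eaten.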

Assuming you play the optimal strategy, what is the expected number of points you can get in the bonus level?

Input and output format

Input format:

The first line contains two positive integers k and n: the number of treasures thrown and the number of kinds of treasure. Each of the following n lines describes one kind of treasure: the first integer is its score, and the following integers list its prerequisite treasures in turn (treasures are numbered 1 to n), terminated by 0.

Output format:

Output a single real number, rounded to six decimal places: the expected score under the optimal strategy.

Input and output example

Sample input #1:
1 2
1 0
2 0
Sample output #1: 
1.500000
Sample input #2:
6 6
12 2 3 4 5 0
15 5 0
-2 2 4 5 0
-11 2 5 0
5 0
1 2 4 5 0
Sample output #2: 
10.023470

Notes

1 <= k <= 100, 1 <= n <= 15; each score is an integer in [-10^6, 10^6].

 

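Solution: DP over subsets. Let f[i][j] be the best expected score from round i onward (rounds numbered from 0) when the set of treasures eaten so far is the bitmask j. The rounds are processed backward, from k-1 down to 0, so that the successor values f[i+1][*] are already known. For each kind of treasure that might be thrown, take the max of the successor expectation with and without eating it (only "without" if its prerequisites are not met), then average over the n kinds. The answer is f[0][0].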
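Written out as a recurrence, the idea can be sketched as follows. The notation is an assumption made here for illustration: $A$ is the set of treasures already eaten, $f_{i,A}$ is the best expected score from round $i$ onward (rounds numbered from 0), and $P_t$, $S_t$ are the score and prerequisite set of treasure $t$ from the problem statement.

```latex
f_{k,A} = 0, \qquad
f_{i,A} = \frac{1}{n} \sum_{t=1}^{n}
\begin{cases}
\max\bigl(f_{i+1,A},\; f_{i+1,A \cup \{t\}} + P_t\bigr), & S_t \subseteq A,\\[4pt]
f_{i+1,A}, & \text{otherwise,}
\end{cases}
\qquad 0 \le i < k .
```

The answer is $f_{0,\varnothing}$. In sample #1 ($k = 1$, no prerequisites) this gives $f_{0,\varnothing} = \tfrac{1}{2}\bigl(\max(0, 1) + \max(0, 2)\bigr) = 1.5$, which matches the sample output.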

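#include <algorithm>
#include <cstdio>
using namespace std;

// ci[i] = 1 << i; pre[i] = prerequisite bitmask of treasure i; val[i] = its score
int ci[25], n, k, pre[25], val[25], now;
// f[i][j]: expected score from round i onward when the set j has already been eaten.
// f[k][*] stays 0 because global arrays are zero-initialized.
double f[105][40005], tmp;

inline void solve() {
    tmp = 1.0 / n;  // probability of each kind of treasure on a throw
    // Go backward over the rounds so every successor f[i+1][*] is already final.
    for (int i = k - 1; i >= 0; i--)
        for (int j = 0; j < ci[n]; j++)
            for (int l = 0; l < n; l++) {
                if ((pre[l] & j) == pre[l])  // prerequisites met: eat or skip, whichever is better
                    f[i][j] += tmp * max(f[i + 1][j | ci[l]] + val[l], f[i + 1][j]);
                else  // cannot eat: the throw is wasted
                    f[i][j] += tmp * f[i + 1][j];
            }
}

int main() {
    ci[0] = 1;
    for (int i = 1; i <= 20; i++) ci[i] = ci[i - 1] << 1;
    scanf("%d%d", &k, &n);
    for (int i = 0; i < n; i++) {
        scanf("%d", val + i);
        // Read the prerequisites (numbered from 1) until the terminating 0.
        while (scanf("%d", &now) == 1 && now) pre[i] |= ci[now - 1];
    }

    solve();

    printf("%.6f\n", f[0][0]);
    return 0;
}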

  


Origin http://43.154.161.224:23101/article/api/json?id=324743428&siteId=291194637