Prerequisites for SCI writing - understanding the essence of the paper

A paper can be thought of as a collection of evidence and supporting explanations; that is, an attempt to persuade others to share your conclusions.

1. Hypothesis

In outline, a sample research process might proceed as follows.
● Researchers studying algorithms may speculate whether it is possible to make better use of caches on the CPU to reduce computational costs.
● Preliminary investigation may lead to the hypothesis that, despite their additional computational cost, array-based structures with good memory locality are superior in practice to tree-based structures, whose poor locality makes them slow.
● This hypothesis raises the research question of whether a specific sorting algorithm can be improved by replacing a tree structure with an array structure.
● If the hypothesis is correct, a trend should be observable: for example, as the number of items to be sorted grows, the tree-based method should show an increasingly higher cache miss rate.
● The evidence is the number of cache misses measured for several sets of items to be sorted. Alternatively, indirect evidence can be used, such as the change in execution time as data volume grows.
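The indirect evidence in the last bullet, execution time as data volume changes, can be sketched as a simple timing sweep. This is a hypothetical sketch: the function name, input sizes, and repeat count are illustrative assumptions, and a real study would also record hardware cache-miss counters (for example via `perf stat`):

```python
import random
import time

def time_sort(n, repeats=5):
    """Time the built-in sort on n random floats; return the best of `repeats` runs."""
    data = [random.random() for _ in range(n)]
    best = float("inf")
    for _ in range(repeats):
        trial = list(data)  # fresh copy so every run sorts unsorted input
        start = time.perf_counter()
        trial.sort()
        best = min(best, time.perf_counter() - start)
    return best

# Sweep input sizes; per-element time that grows faster than the comparison
# count predicts, once the data outgrows the cache, would be indirect
# evidence of cache effects.
for n in [10_000, 100_000, 1_000_000]:
    t = time_sort(n)
    print(f"n={n:>9}: total {t:.4f}s, per element {t / n:.2e}s")
```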
As this example illustrates, the structure of a research proposal comes from having a clear research question and hypothesis.
It is also important to state what is not being claimed, that is, what the limitations of the conclusions are.

Consider an example. Suppose the P-list is a well-known data structure used in a range of applications, particularly as a fast and compact in-memory search structure. A scientist developed a new data structure called Q-list. Formal analysis showed that both structures have the same asymptotic complexity in space and time, but the scientist intuitively believed that the Q-list was superior in practice and decided to prove this experimentally.

This belief, or instinct, is a key element of the scientific process: ideas are rarely certain to be correct when first conceived, and it is intuition that suggests they are worth pursuing. That is, the investigation is likely undertaken for subjective reasons; but the final report of the research, the published paper, must be objective.

Continuing the above example, the hypothesis can be summarized as:
×: Q-lists are better than P-lists.

But this statement is not sufficient as a basis for experimentation: for it to be confirmed, the claimed superiority would have to hold for all applications, under all conditions, for all time. Formal analysis might be able to justify such a result, but no experiment could be so far-reaching. In any case, it is rare for one data structure to be completely superseded, witness the persistence of arrays and linked lists, so the claim as stated is probably incorrect. A testable hypothesis might be:

√: As an in-memory search structure for large data sets, Q-lists are faster and more compact than P-lists.

Further qualifications may be necessary:

√: We assume a skewed access pattern; that is, most accesses will be to only a small part of the data.

The qualification imposes a scope on the claim about Q-lists. Given the hypothesis, a reader with a different kind of application has enough information to reasonably conclude that Q-lists may not suit it. This restriction does not invalidate the result; it strengthens it by making it more precise. Another scientist is free to explore how Q-lists behave under another set of conditions, where they may not be as good as P-lists but where the original hypothesis still holds.

As this example illustrates, a hypothesis must be testable. One aspect of testability is limiting the scope to something that can actually be explored.
Another crucial aspect is that the hypothesis should be falsifiable. Vague claims are unlikely to meet this standard:

×: Q-list performance is comparable to P-list performance.

One form of research in which poor hypotheses seem particularly common is "black box" work, where a black box is an algorithm whose properties are poorly understood. For example, some studies consist of applying a black-box learning algorithm to new data, with the result being an improvement over a baseline method. (Typically the claim is something like "our black box is much better than random".) The apparent ability of these black boxes to solve problems without creative input from the scientist attracts low-value research. One weakness of such research is that it yields no insight into the data or the black box, and has no implications for other investigations. In particular, such results tell us little about whether the same behaviour would occur if the method were applied in a different context, or even to a new but similar data set; that is, the results are not predictive. In some cases it can be interesting to observe the behaviour of an algorithm on some data, but usually the purpose of an experiment is to confirm a model or theory that can then be used to predict future behaviour. That is, we use experiments to learn about more general properties, and it is this generality that is missing from black-box studies.

Moreover, a hypothesis should precede the experiments. A hypothesis is often arrived at by observation, but can be regarded as confirmed only once it has made successful predictions. There is a vast difference between an observation such as "the algorithm worked on our data" and a tested hypothesis such as "the algorithm was predicted to work on this type of data, and the prediction was confirmed on our data". Another perspective on this issue is that testing should be as blind as possible. If the experiment and hypothesis have been fine-tuned on the data, the experiment can hardly be said to provide confirmation; at best it provides the observations on which the hypothesis is based. In other words: hypothesize first, then test.

If two hypotheses are equally consistent with the observations and one is significantly simpler than the other, the simpler should be chosen. This principle, known as Occam's razor, is not merely a matter of convenience; it is well established that there is no reason to prefer a complex explanation when a simpler one is available.

2. Justifying the hypothesis

One component of a strong paper is a precise and interesting hypothesis; another component is the testing of the hypothesis and the presentation of supporting evidence.
As part of the research process, you need to test your hypothesis and, if it is correct—or, at least, not disproven—gather supporting evidence.
When formulating a hypothesis, you need to construct an argument that links your hypothesis to the evidence.

For example, the hypothesis that "the new range-search method is faster than the previous method" might be supported by the evidence that "a range search over n elements requires 2 log(log n) + c comparisons". This may or may not be good evidence, but on its own it is not convincing, because no argument connects the evidence to the hypothesis. What is missing is information such as "results for the previous method show that its asymptotic cost is log n comparisons"; the role of the connecting argument is to show that the evidence does support the hypothesis and that the conclusion follows.
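The connecting step can be made concrete numerically. A minimal sketch (the additive constant c is unknown, so the value below is an assumption) tabulates both cost formulas and shows how much more slowly the new method's comparison count grows:

```python
import math

C = 3  # the constant c in the claimed bound is unknown; this value is an assumption

def new_cost(n):
    """Claimed comparisons for the new range-search method: 2 log(log n) + c."""
    return 2 * math.log2(math.log2(n)) + C

def old_cost(n):
    """Asymptotic comparisons reported for the previous method: log n."""
    return math.log2(n)

# For large n the gap widens rapidly, which is the substance of the argument
# linking the evidence to the hypothesis.
for n in [10**3, 10**6, 10**9]:
    print(f"n={n:<12} new: {new_cost(n):6.2f}   old: {old_cost(n):6.2f}")
```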

When constructing an argument, it can be helpful to imagine defending your hypothesis to a colleague who plays the role of prosecutor. That is, raising objections and defending against them is a way of gathering the material you need to convince the reader that your argument is correct. Starting from the premise that "the new string hashing algorithm is fast because it does not use multiplication or division", you might argue as follows: . . .
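The premise of that argument, a string hash built without multiplication or division, can be sketched. This is a hypothetical construction to illustrate the idea, not the algorithm the example refers to; the seed constant, shift amounts, and table size are arbitrary assumptions:

```python
def shift_xor_hash(key: str, table_size_bits: int = 20) -> int:
    """Hash a string using only shifts, XOR, addition, and a bit mask.

    Hypothetical sketch of the "no multiplication or division" premise:
    each character is folded in with left/right shifts and XOR, and the
    table index is taken with a bit mask instead of a modulo operation.
    """
    h = 0x1F351F35  # arbitrary non-zero seed (an assumption, not a published constant)
    for ch in key:
        h ^= (h << 5) + (h >> 2) + ord(ch)
        h &= 0xFFFFFFFF  # confine to 32 bits, as C unsigned arithmetic would
    return h & ((1 << table_size_bits) - 1)  # mask replaces h % table_size

print(shift_xor_hash("example"))
```

A prosecutor-style objection to anticipate: masking to a power-of-two table size discards high-order bits, so the distribution of the retained bits must itself be defended.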

In an argument, you need to rebut likely objections while conceding points that cannot be rebutted and acknowledging points you are unsure of. If, while developing your hypothesis, you raised an objection and the reasoning that dealt with it is valuable, include that reasoning in the paper; doing so lets readers follow your thinking and greatly helps those who would independently raise the same objection. In short, you need to anticipate any concerns your readers may have about your hypothesis. Likewise, you should actively search for counterexamples.

If you think of an objection you cannot refute, do not set it aside.
At the very least you should raise it yourself in the paper, but more likely it means you need to reconsider your results.

Always consider the possibility that your hypothesis is wrong. Often a hypothesis that turns out to be correct looks dubious at some stage, perhaps early on, before it has been fully developed, or when it appears to be contradicted by initial experimental evidence; yet the hypothesis survives, and may even be strengthened, through testing and refinement in the face of doubt. Just as often, though, the hypothesis is wrong, and persisting with it is a waste of time. It is sensible to persist long enough to determine whether it might be true, but foolish to persist any longer.
A corollary is that the stronger your intuitive liking for a hypothesis, the more rigorously you should test it, that is, try to confirm or disprove it without distorting the results, and the more convincingly you must justify it. Taking the study of an algorithm's characteristics as an example, questions such as the following need to be answered.

● Will readers believe that the algorithm is new? Only if the researcher has done a careful literature review and has fully explored and explained previous relevant work. Doing so includes giving due credit to significant advances rather than overvaluing work of lesser relevance.
● Will readers believe that the algorithm is sensible? It should be explained carefully. Potential problems should be identified and either acknowledged, for example with an explanation of why the algorithm is not universally applicable, or dismissed with a convincing argument.
● Are the experiments convincing? If the code is not publicly available, could there be something wrong with it? Was the right data used? Was enough data used?

Every research project prompts its own skeptical questions. Such questioning is equally appropriate for future research projects, giving the author the opportunity to evaluate the work critically.

3. Forms of evidence

Good science uses objective evidence to achieve its aims, such as convincing readers to make more informed decisions and deepening their understanding of problems and solutions. In a paper, you pose a question or hypothesis and then present evidence in support of your case. The evidence needs to be convincing because the scientific process depends on the reader's critical, skeptical stance; a reader has no reason to be interested in work that is inconclusive.

Broadly speaking, there are four types of evidence that can be used to support a hypothesis: proof, modeling, simulation, and experimentation.

A proof is a formal argument that the hypothesis is correct. A model is a mathematical description of the hypothesis (or of some component of it, such as an algorithm whose properties are under consideration), usually accompanied by an argument that the hypothesis and the model do correspond.

A simulation is usually a realization or partial realization of a simplified form of a hypothesis, in which the difficulties of full realization are avoided by omission or approximation.

A great advantage of simulation is that it provides parameters that can be adjusted smoothly, allowing researchers to observe behaviour over a wide range of inputs or data characteristics. For example, if you are comparing algorithms for correcting errors in genetic data, simulated data lets you control the error rate and observe when the different algorithms begin to fail. Real data may have an unknown number of errors, or only a few distinct error rates, and may therefore be less informative in this respect. However, there is always a risk that a simulation is unrealistic or oversimplified, with properties that mean the observed results would not occur in practice. Simulations are powerful tools, but ultimately they need to be validated against reality.
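The genetic-data example can be sketched as a tiny simulator that injects substitution errors at a controlled rate, giving exactly the adjustable parameter that real data lacks. The sequence length, seed, and rates are illustrative assumptions:

```python
import random

def inject_errors(sequence: str, error_rate: float, rng: random.Random) -> str:
    """Return a copy of `sequence` with each base substituted with probability `error_rate`."""
    bases = "ACGT"
    out = []
    for base in sequence:
        if rng.random() < error_rate:
            out.append(rng.choice([b for b in bases if b != base]))
        else:
            out.append(base)
    return "".join(out)

rng = random.Random(42)  # fixed seed so the sweep is reproducible
truth = "".join(rng.choice("ACGT") for _ in range(10_000))

# Sweep the error rate smoothly, which is the key advantage of simulation
# noted above; an error-correction algorithm would be evaluated at each rate.
for rate in [0.001, 0.01, 0.05, 0.10]:
    noisy = inject_errors(truth, rate, rng)
    observed = sum(a != b for a, b in zip(truth, noisy)) / len(truth)
    print(f"target rate {rate:.3f} -> observed {observed:.3f}")
```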

An experiment is a comprehensive test of a hypothesis based on a proposed implementation and real or highly realistic data.
In an experiment there is a sense of really doing it, while in a simulation there is a sense of only pretending. For example, artificial data provides a mechanism for exploring behaviour, but for the results to be convincing, the corresponding behaviour needs to be observed on real data.

In some cases, however, the distinction between simulation and experiment is blurred. In principle, an experiment can only show that the hypothesis holds for the specific data used; models and simulations can generalize the conclusions (albeit imperfectly) to other cases.

4. Use of evidence

When choosing whether to use proofs, models, simulations, or experiments as evidence, consider how persuasive each piece of evidence will be to the reader.

Choose a form of evidence not to minimize your own effort, but to be as persuasive as possible.

5. Measurement methods

When you develop your research question, then, ask: what is to be measured? What measures will be used? For example, when examining an algorithm, is execution time the right measure? If so, what mechanism will be used to measure it? Even for a single-threaded process running on one machine, this question may be difficult to answer; for a distributed process using varied resources across a network, there may be no perfect answer at all, just a range of options with assorted pitfalls and drawbacks, each of which you and your readers need to understand.
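For the single-machine case, one defensible mechanism is repeated wall-clock timing with both the minimum and the median reported. This is a sketch of one option among the range of options above, not the only correct one; the repeat count and workload are illustrative assumptions:

```python
import statistics
import time

def measure(fn, repeats: int = 11):
    """Time fn() several times; return (min, median) wall-clock seconds.

    The minimum estimates the uninterrupted cost; the gap between the
    minimum and the median hints at interference (scheduler, garbage
    collector), one of the pitfalls any chosen mechanism should expose.
    """
    times = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        times.append(time.perf_counter() - start)
    return min(times), statistics.median(times)

best, typical = measure(lambda: sorted(range(100_000, 0, -1)))
print(f"min {best:.4f}s, median {typical:.4f}s")
```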

As another example, evidence for a claim that network quality has improved might be that the average time to transmit a packet has decreased, a measurable quantity.
However, if the goal of improving the network is reduced to the goal of reducing average latency, other aspects of the qualitative goal, such as the smoothness of video transmission or the effectiveness of remote location services, may be overlooked.
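The risk of reducing a qualitative goal to a single number can be shown with two invented latency traces that share a mean but differ in smoothness; the values are fabricated purely for illustration:

```python
import statistics

# Two hypothetical packet-latency traces (milliseconds) with the same mean:
steady = [20.0] * 8               # smooth delivery
bursty = [5.0] * 7 + [125.0]      # same mean, but one long stall

# Mean latency alone cannot distinguish them; max and jitter (population
# standard deviation) reveal the stall that would ruin video smoothness.
for name, trace in [("steady", steady), ("bursty", bursty)]:
    mean = statistics.mean(trace)
    worst = max(trace)
    jitter = statistics.pstdev(trace)
    print(f"{name}: mean {mean:.1f} ms, max {worst:.1f} ms, jitter {jitter:.1f} ms")
```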

The advantages of well-designed experimental work are equally obvious.
In computer science, work that experimentally confirms or disproves formal results has historically been undervalued: perhaps because experimental standards were low; perhaps because the sheer diversity of computer systems, languages, and data makes truly general experiments hard to design; perhaps because highly mathematical theoretical work carries more intellectual prestige than what some regard as mere coding. Yet many questions cannot easily be answered analytically, and a theory without practical confirmation is of no more interest in computing than in any other science.

"Hypotheses, Questions, and Evidence" Checklist

Regarding hypotheses and questions:

  • What is the phenomenon or property being studied?
  • Why are you interested in them?
  • Is the purpose of this study clear?
  • What are the specific hypotheses and research questions?
  • Do these elements convincingly relate to each other?
  • To what extent is this work innovative?
  • Is the degree of innovation reflected in the claims?
  • What would disprove the hypothesis?
  • Does it have any implausible consequences?
  • What are the underlying assumptions?
  • Are they sensible?
  • Has the work been critically questioned?
  • Are you confident it is sound science?

Regarding evidence and measurement:

  • What forms of evidence should be used?
  • If the evidence is a model or simulation, how can you show that the results will hold in practice?
  • How is the evidence measured? Are the chosen measurement methods objective, appropriate and reasonable?
  • What are the qualitative goals?
  • What makes the quantitative method you choose suitable for these goals?
  • What compromises or simplifications are inherent in the metrics you choose?
  • Are the results predictive?
  • What is the argument linking the evidence to the hypothesis?
  • To what extent do positive results convincingly confirm the hypothesis?
  • Would a negative result disprove it?
  • What are some possible weaknesses or limitations of your approach?


Origin blog.csdn.net/Strive_LiJiaLe/article/details/134685941