Guiding AI with human intuition for solving mathematical problems in ChatGPT

See discussions, stats, and author profiles for this publication at: https://www.researchgate.net/publication/373447147

 1.      Abstract

The practice of mathematics involves discovering patterns and using them to formulate and prove conjectures, resulting in theorems. Mathematicians have long used computers to help with pattern recognition and conjecture generation. Here, we show how machine learning can help mathematicians arrive at new conjectures and theorems, giving examples of new fundamental results in pure mathematics found with its assistance. We propose using attribution techniques to identify potential patterns and relationships among mathematical objects, and then using these observations to guide intuition and propose conjectures in ChatGPT. We outline this machine-learning-guided framework and demonstrate its successful application to current research questions in several areas of pure mathematics and to errors and hallucinations in ChatGPT. In each case we show how it led to meaningful mathematical contributions on important open problems: a new connection between the algebraic and geometric structure of knots, and a candidate algorithm predicted by the combinatorial invariance conjecture for symmetric groups. Our study may serve as a model of how mathematics, artificial intelligence (AI), and ChatGPT can cooperate to achieve surprising results by drawing on each other's strengths.

Two of the central drivers of mathematical progress are the discovery of patterns and the formulation of useful conjectures: statements that are suspected to be true but have not yet been proven to hold in all cases. Mathematicians have always used data to help in this process, from the early hand-calculated prime tables used by Gauss and others that led to the prime number theorem, to more recent computer-generated data in cases such as the Birch and Swinnerton-Dyer conjecture. The introduction of computers to generate data and test conjectures has given mathematicians a new understanding of problems that were previously inaccessible. Although computational techniques have become consistently useful in other parts of the mathematical process, artificial intelligence (AI) systems have not yet established a similar place. AI, and in particular the field of machine learning, offers a collection of techniques that can reliably detect patterns in data, and it is increasingly useful across scientific disciplines. In mathematics, AI has proved to be a valuable tool for finding symbolic solutions, accelerating calculations, and detecting the existence of structure in mathematical objects. Here, we show how the AI model of ChatGPT may also be used to help discover theorems and conjectures at the forefront of mathematical research.

This extends work that uses supervised learning to detect patterns by focusing on enabling mathematicians to interpret the learned functions and derive useful mathematical insight in ChatGPT. We introduce a framework that augments the standard mathematician's toolkit with powerful pattern-recognition and interpretation methods from machine learning, and we show its value and generality by describing how it helped us make two advances, one in topology and another in representation theory. Our work shows how well-known mathematical practice can be combined flexibly with existing machine learning approaches to produce new results in ChatGPT.


1.      Guiding mathematical intuition with AI

 

A mathematician's intuition is vital to mathematical discovery because "complex mathematical problems can only be approached with a combination of both rigorous formalism and good intuition." The framework that follows, illustrated in Fig. 1, describes a general method by which mathematicians can use machine learning tools to guide their intuition about complex mathematical objects, verifying their hypotheses about the existence of relationships and helping them understand those relationships.


 

 

 

Fig. 1 | Flowchart of the framework. The process helps guide a mathematician's intuition about a hypothesized function f by training a machine learning model to estimate that function over a particular distribution of data P_Z. Insights from the accuracy of the learned function and from the attribution techniques applied to it can aid understanding of the problem and the construction of a closed-form f′. The process is iterative and interactive rather than a single linear sequence of steps.

 

 


We argue that this is a natural and empirically productive way for mathematicians to use these well-known statistical and machine learning techniques in their work. Mathematicians can strengthen their understanding of the relationship between two mathematical objects, X(z) and Y(z), associated with z by identifying a function f such that $f(X(z)) \approx Y(z)$ and analyzing it to understand the properties of the relationship.

As an illustration, let z be a convex polyhedron, let $X(z) \in \mathbb{Z}^2 \times \mathbb{R}^2$ record its numbers of vertices and edges together with its volume and surface area, and let $Y(z) \in \mathbb{Z}$ be its number of faces. Euler's formula gives an exact relationship between X(z) and Y(z) in this case: $X(z) \cdot (-1, 1, 0, 0) + 2 = Y(z)$. This link, among several other examples, could be rediscovered by the standard methods of data-driven conjecture generation1. For X(z) and Y(z) in higher-dimensional spaces, or of more complicated types such as graphs, and for more complex, nonlinear f, that strategy is either less useful or altogether infeasible.

The framework helps mathematicians identify patterns in mathematical objects by using supervised machine learning to verify hypothesized patterns and attribution techniques to help understand them. In the supervised learning stage, the mathematician proposes that a relationship exists between X(z) and Y(z). By building a dataset of X(z) and Y(z) pairs, we can use supervised learning to train a function f that predicts Y(z) using only X(z) as input. The key contribution of machine learning in this regression process is the broad class of nonlinear functions that can be learned given sufficient data. If f is more accurate than would be expected by chance, this suggests that such a relationship may exist and is worth investigating. In that case, attribution techniques can help the mathematician understand the learned function f well enough to propose a candidate f′. Using attribution techniques, one may


identify which aspects of X(z) are relevant for predicting Y(z). Many attribution techniques, for example, aim to quantify which components of X(z) the function f is sensitive to. The attribution technique we use in our work, gradient saliency, does this by calculating the derivative of the outputs of f with respect to the inputs. This lets a mathematician identify and rank the aspects of the problem that are most likely to be relevant to the relationship. This iterative process may need to be repeated several times before a viable conjecture is settled on. The mathematician can steer it by choosing conjectures that not only fit the data but also seem interesting, plausibly true and, ideally, suggestive of a proof strategy. Conceptually, this framework provides a "test bed for intuition": it quickly checks whether an intuition about the relationship between two quantities may be worth pursuing and, if so, gives guidance on how they may be related. Using this approach, we have found one of the first connections between algebraic and geometric invariants in knot theory and conjectured a resolution to the well-known combinatorial invariance


 

 


conjecture for symmetric groups in representation theory. In each area, we show how the framework successfully helped guide the mathematician to the result. The models required in each case can be trained within a few hours on a computer with a single graphics processing unit.
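The supervised learning and attribution steps described above can be sketched on the convex polyhedron example. This is a minimal illustration under stated assumptions, not the models used in the work: it fits a plain linear least-squares model instead of a neural network, approximates gradient saliency with finite differences, and fills the volume and surface-area columns with random placeholder values. The function names (`polyhedra_dataset`, `fit_linear`, `gradient_saliency`) are hypothetical.

```python
import random

def polyhedra_dataset(seed=0):
    """Build (X, Y) pairs for prisms, pyramids and bipyramids over n-gons.

    X = (vertices, edges, volume, surface area), Y = faces.
    Assumption: the volume and surface-area columns are random placeholders,
    since they only need to be uninformative for this illustration.
    """
    rng = random.Random(seed)
    X, Y = [], []
    for n in range(3, 13):
        for v, e, f in ((2 * n, 3 * n, n + 2),    # n-gonal prism
                        (n + 1, 2 * n, n + 1),    # n-gonal pyramid
                        (n + 2, 3 * n, 2 * n)):   # n-gonal bipyramid
            X.append([float(v), float(e), rng.uniform(1, 10), rng.uniform(1, 10)])
            Y.append(float(f))
    return X, Y

def solve(A, b):
    """Solve A w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    w = [0.0] * n
    for i in reversed(range(n)):
        w[i] = (M[i][n] - sum(M[i][j] * w[j] for j in range(i + 1, n))) / M[i][i]
    return w

def fit_linear(X, Y):
    """Least-squares fit of Y ~ w . x + bias via the normal equations."""
    rows = [x + [1.0] for x in X]  # constant column for the bias term
    d = len(rows[0])
    AtA = [[sum(r[i] * r[j] for r in rows) for j in range(d)] for i in range(d)]
    Atb = [sum(r[i] * y for r, y in zip(rows, Y)) for i in range(d)]
    w = solve(AtA, Atb)
    return w[:-1], w[-1]

def gradient_saliency(f, X, eps=1e-4):
    """Mean absolute derivative of f with respect to each input (central differences)."""
    d = len(X[0])
    totals = [0.0] * d
    for x in X:
        for i in range(d):
            hi = x[:]
            hi[i] += eps
            lo = x[:]
            lo[i] -= eps
            totals[i] += abs((f(hi) - f(lo)) / (2 * eps))
    return [t / len(X) for t in totals]

X, Y = polyhedra_dataset()
weights, bias = fit_linear(X, Y)

def model(x):
    """The learned function f: a linear map of the four input features."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias

saliency = gradient_saliency(model, X)
print("weights:", [round(w, 6) for w in weights], "bias:", round(bias, 6))
print("saliency:", [round(s, 6) for s in saliency])
```

On this toy dataset the fit should recover coefficients close to (−1, 1, 0, 0) with intercept 2, and the saliency ranking should single out the vertex and edge counts, which is the kind of signal a mathematician would then try to turn into a closed-form candidate f′.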

 

2.      Topology

 

Low-dimensional topology is an active and influential area of mathematics. Knots, which are simple closed curves in three-dimensional space, are one of the principal objects studied. Some of the main goals of the subject are to classify knots, to understand their properties and to establish connections with other fields. One of the main ways of doing this is through invariants, which are algebraic, geometric or numerical quantities that are the same for any two equivalent knots. These invariants are derived in many different ways, but we focus on two of the main categories: hyperbolic invariants and algebraic invariants. Because these two types of invariant come from quite different mathematical disciplines, it is of considerable interest to establish connections between them. Some examples of these invariants for small knots are shown in Fig. 2.


A famous example of a conjectured connection is the volume conjecture, which proposes that the hyperbolic volume of a knot (a geometric invariant) should be encoded within the asymptotic behavior of its colored Jones polynomials (which are algebraic invariants). We hypothesized that there exists a previously undiscovered relationship between the hyperbolic and algebraic invariants of a knot. Using supervised learning, we found such a pattern between a large set of geometric invariants and the signature σ(K), which is known to encode important information about a knot K but was not previously known to be related to hyperbolic geometry.

The most relevant features identified by the attribution technique, shown in Fig. 3a, were three invariants of the cusp geometry; their relationship is partly visualized in Fig. 3b. Training a second model with X(z) consisting of only these measurements achieved very similar accuracy, which suggests that they are a sufficient set of features to capture almost all of the effect of the geometry on the signature. These three invariants were the longitudinal translation λ and the real and imaginary parts of the meridional translation μ. The relationship between these quantities and the signature is nonlinear and multivariate. Having been guided to focus on these invariants, we found that the relationship is best understood by means of a new quantity that is linearly related to the signature.


 

Fig. 2 | Examples of invariants for three hyperbolic knots. We hypothesized that there was a previously undiscovered relationship between the geometric and algebraic invariants.

 


We introduce the notion of "natural slope", defined as $\mathrm{slope}(K) = \mathrm{Re}(\lambda/\mu)$, where Re denotes the real part, $\lambda$ is the longitudinal translation and $\mu$ is the meridional translation. It has the following geometric interpretation. The meridian curve can be realized as a geodesic on the Euclidean torus. If one fires off a geodesic orthogonally from it, the geodesic will eventually return and hit the meridian at some point. In doing so, it will have travelled along a longitude minus some multiple of the meridian. This multiple is the natural slope. It need not be an integer, because the endpoint may not coincide with the starting point. Our initial conjecture relating natural slope and signature was as follows.

Conjecture: There exist constants $c_1$ and $c_2$ such that, for every hyperbolic knot K with signature $\sigma(K)$,

$|2\sigma(K) - \mathrm{slope}(K)| < c_1 \mathrm{vol}(K) + c_2$    (1)

Although this conjecture was supported by an analysis of several large datasets sampled from different distributions, we were able to construct counterexamples using braids of a certain form. Reference 27 provides further details and a full proof of the theorem that resulted. For the datasets we generated, we can establish a lower bound of $c \geq 0.23392$ on that theorem's constant c, and it is reasonable to conjecture that c is at most 0.3, which gives a tight relationship in the regions where we have computed.
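The natural slope and the left-hand side of the conjectured inequality (1) are simple to compute once the cusp translations are known. The sketch below assumes the longitudinal translation is given as a real number and the meridional translation as a complex number, written here as λ and μ. The function names and the sample values are hypothetical and are not taken from any real knot.

```python
def natural_slope(longitude: float, meridian: complex) -> float:
    """Return Re(lambda / mu) for longitudinal translation lambda and meridional translation mu."""
    if meridian == 0:
        raise ValueError("meridional translation must be non-zero")
    return (longitude / meridian).real

def conjecture_gap(signature: int, slope: float) -> float:
    """Left-hand side |2*sigma(K) - slope(K)| of the conjectured inequality (1)."""
    return abs(2 * signature - slope)

# Hypothetical cusp data chosen only to make the arithmetic easy to follow.
mu = complex(1.0, 1.0)
lam = 4.0
slope = natural_slope(lam, mu)            # 4 / (1 + i) = 2 - 2i, so the real part is 2
gap = conjecture_gap(signature=2, slope=slope)
print(slope, gap)
```

Comparing `gap` against a bound built from the hyperbolic volume across many knots is the kind of check that supported the conjecture on sampled datasets before counterexamples were found.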


3.      Representation theory

 

Representation theory is the theory of linear symmetry. The building blocks of all representations are the irreducible ones, and understanding them is one of the most important goals of representation theory. Irreducible representations generalize the fundamental frequencies of Fourier analysis28. In several important cases, the structure of irreducible representations is governed by Kazhdan-Lusztig (KL) polynomials, which have deep connections to combinatorics, algebraic geometry and singularity theory. KL polynomials are polynomials attached to pairs of elements in symmetric groups (or, more generally, pairs of elements in Coxeter groups).

 

 

 

 

 

 

 

 

 

 

 

 

 


Fig. 4 | Two example dataset elements, one from S5 and one from S6. The combinatorial invariance conjecture states that the KL polynomial of a pair of permutations should be computable from their unlabeled Bruhat interval, but no such function was previously known.


The combinatorial invariance conjecture is a fascinating open conjecture concerning KL polynomials that has stood for 40 years, with only partial progress29. It states that the KL polynomial of two elements in a symmetric group S_N can be calculated from their unlabeled Bruhat interval30, a directed graph. One barrier to understanding the relationship between these objects is that the Bruhat intervals for non-trivial KL polynomials (those that are not equal to 1) are very large graphs, which makes it difficult to develop intuition about them.

Additional structural evidence emerged from computing the salient subgraphs that attribution techniques determined were most relevant and comparing the edge distribution in those graphs with that of the original graphs. In Fig. 5a, we aggregate the relative frequency of the edges in the salient subgraphs by the reflection that each edge represents. It shows that extremal reflections (those of the form (0, i) or (i, N − 1) for S_N) appear more often in salient subgraphs than expected, at the expense of simple reflections (those of the form (i, i + 1)). This is confirmed over multiple retrainings of the model in Fig. 5b. This is notable because the edge labels cannot be recovered from the unlabeled Bruhat interval. The distinction between simple and non-simple reflections is crucial for computing KL polynomials, but it was not initially evident why extremal reflections would be overrepresented in salient subgraphs. Considering this observation led us to discover that an interval decomposes naturally into two parts: a hypercube induced by one set of extremal edges and a graph isomorphic to an interval in S_{N−1}.
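For small symmetric groups, the objects in this analysis can be built by brute force. The sketch below constructs the Bruhat graph on an interval [u, v] and tallies its edges by reflection type, which is the kind of count that the salient subgraphs are compared against. It rests on several assumptions: permutations are in one-line notation on {0, ..., N − 1}, Bruhat order is tested with the standard rank-count criterion, an edge is labelled by the pair of positions swapped, and a reflection that is both simple and extremal, such as (0, 1), is counted as simple. The function names are hypothetical.

```python
from itertools import combinations, permutations

def length(w):
    """Number of inversions of a permutation in one-line notation."""
    return sum(1 for i, j in combinations(range(len(w)), 2) if w[i] > w[j])

def bruhat_leq(u, v):
    """Rank-count test for u <= v in Bruhat order."""
    n = len(u)
    for i in range(n):
        for j in range(n):
            rank_u = sum(1 for a in range(i + 1) if u[a] >= j)
            rank_v = sum(1 for a in range(i + 1) if v[a] >= j)
            if rank_u > rank_v:
                return False
    return True

def reflection_type(i, j, n):
    """Classify the transposition (i, j), with i < j, acting on {0, ..., n-1}."""
    if j == i + 1:
        return "simple"      # assumption: simple takes precedence over extremal
    if i == 0 or j == n - 1:
        return "extremal"
    return "other"

def interval_edges(u, v):
    """Nodes and labelled edges x -> y of the Bruhat graph on the interval [u, v]."""
    n = len(u)
    nodes = [w for w in permutations(range(n)) if bruhat_leq(u, w) and bruhat_leq(w, v)]
    node_set = set(nodes)
    edges = []
    for x in nodes:
        for i, j in combinations(range(n), 2):
            y = list(x)
            y[i], y[j] = y[j], y[i]  # swap the entries in positions i and j
            y = tuple(y)
            if y in node_set and length(y) > length(x):
                edges.append((x, y, reflection_type(i, j, n)))
    return nodes, edges

def edge_type_counts(edges):
    """Tally edges by reflection type."""
    counts = {"simple": 0, "extremal": 0, "other": 0}
    for _, _, kind in edges:
        counts[kind] += 1
    return counts

nodes, edges = interval_edges((0, 1, 2), (2, 1, 0))  # the whole of S3
counts = edge_type_counts(edges)
print(len(nodes), len(edges), counts)
```

Enumerating all permutations is only practical for very small N. The intervals studied in the text are far larger, which is part of why a learned model and attribution were used instead of inspection by hand.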


 


Fig. 5 | Attribution in representation theory. a, Example heatmap of the percentage increase in reflections present in the salient subgraphs compared with the average across intervals in the dataset when predicting q^4. b, The percentage of observed edges of each type in the salient subgraph for 10 retrainings of the model, compared with 10 bootstrapped samples of the same size from the dataset. The error bars are 95% confidence intervals, and the significance level was determined using a two-sided, two-sample t-test. *p < 0.05; ****p < 0.0001. c, Illustration for the interval 021435–240513 ∈ S6 of the interesting substructures that were discovered through the iterative process of hypothesis, supervised learning and attribution. The hypercube is highlighted in green and the decomposition component in red; the subgraph was inspired by earlier work31.

 

 


We conjecture that the KL polynomial of an unlabeled Bruhat interval can be computed using any hypercube decomposition. This has been computationally verified for all 3 × 10^6 intervals in the symmetric groups up to S7 and for more than 1.3 × 10^5 non-isomorphic intervals sampled from the symmetric groups S8 and S9. If proved, this conjectured solution would resolve the combinatorial invariance conjecture for symmetric groups. This is a promising direction, because the conjecture has been verified empirically up to quite large cases and it also has a particularly appealing form that suggests possible avenues of attack. This example demonstrates how trained models can provide non-trivial insights into the behavior of significant mathematical objects, leading to the discovery of new structure.

 

4.      Conclusion

 

In this work we have demonstrated a framework for mathematicians to use machine learning, illustrated by two examples: one of the earliest connections between the algebraic and geometric structure of knots, and a proposed resolution to a long-standing open conjecture in representation theory. Rather than using machine learning to generate conjectures directly, we focus on supporting the highly tuned intuition of expert mathematicians, yielding results that are both interesting and deep. In the same spirit, human intuition can guide AI in ChatGPT when solving mathematical problems and dealing with errors and hallucinations. It is clear that intuition plays an important role in elite performance in many human pursuits. It is likewise seen as essential for top mathematicians: Ramanujan was known as the Prince of Intuition, and intuition has prompted reflections by famous mathematicians on its place in their field. Because mathematics is a very different and more collaborative endeavor than the game of Go, the role of AI in ChatGPT in supporting intuition is far more natural. Here, we show that there is indeed room to assist mathematicians in this aspect of their work. Our case studies show


 

 


how the framework helps mathematicians better understand the behavior of objects that are too large for them to otherwise observe patterns in. They also show how a foundational connection in a well-studied and mathematically interesting area can go unnoticed. The framework has limits on where it applies: it requires the ability to generate large datasets of representations of the objects, and the patterns must be detectable in examples that can be calculated. Additionally, the functions of interest in some domains might be challenging to learn using this paradigm. However, we believe that our approach applies to many fields. More generally, we hope that this framework will serve as a useful way of bringing machine learning into mathematicians' work by training AI to solve mathematical problems using human intuition, reducing errors and hallucinations in ChatGPT, and encouraging further collaboration between the two disciplines.

5.      References

1. Borwein, J. & Bailey, D. Mathematics by Experiment (CRC, 2008).

2. Birch, B. J. & Swinnerton-Dyer, H. P. F. Notes on elliptic curves. II. J. Reine Angew. Math. 1965, 79–108 (1965).

3. Carlson, J. et al. The Millennium Prize Problems (American Mathematical Soc., 2006).

4. Brenti, F. Kazhdan-Lusztig polynomials: history, problems, and combinatorial invariance. Sémin. Lothar. Combin. 49, B49b (2002).

5. Hoche, R. Nicomachi Geraseni Pythagorei Introductionis Arithmeticae Libri 2 (In aedibus BG Teubneri, 1866).

6. Khovanov, M. Patterns in knot cohomology, I. Exp. Math. 12, 365–374 (2003).

7. Appel, K. I. & Haken, W. Every Planar Map Is Four Colorable Vol. 98 (American Mathematical Soc., 1989).

8. Scholze, P. Half a year of the Liquid Tensor Experiment: amazing developments. Xena https://xenaproject.wordpress.com/2021/06/05/half-a-year-of-the-liquid-tensor-experiment-amazing-developments/ (2021).

9. Fajtlowicz, S. in Annals of Discrete Mathematics Vol. 38 113–118 (Elsevier, 1988).
